Abstract

This thesis develops and analyzes a mathematical model for the reliability
measures of a single-unit system subject to continuous wear due to its operating
environment and randomly occurring shocks that inflict a random amount of damage
to the unit. Assuming a Markovian operating environment and shock arrival
mechanism, Laplace-Stieltjes transform expressions are obtained for the failure time
distribution and all of its moments. Moreover, an analytical expression is derived for
the long-run availability of the single-unit system when it is subject to an
inspect-and-replace maintenance policy. The analytical results are illustrated
numerically and compared with those obtained from Monte Carlo-simulated failure
data. The numerical results indicate that the reliability measures may be accurately
computed via numerical inversion of the transform expressions in a straightforward
manner when the input parameters are known a priori. In stark contrast to the
simulation model, which requires several hours to obtain the reliability measures,
the analytical procedure computes the same measures in only a few seconds.

ANALYTICAL  RESULTS  FOR  A  SINGLE-UNIT  SYSTEM
SUBJECT  TO  MARKOVIAN  WEAR  AND  SHOCKS


1.  Introduction 

1.1  Background 

Throughout its history, the science of reliability has been intertwined with
military applications. The formal mathematics of reliability theory was developed
during the Second World War to examine the high failure rates observed in the
military systems of that era. Military applications continued to drive the growth of
the field through the 1950s and 1960s. As the Cold War between the United States
and the Soviet Union intensified, the national defense strategy of mutually assured
destruction required the military’s nuclear weapons program to attain unprecedented
levels of reliability. To that end, national efforts were made within the military and
academic communities to advance the state of the art in reliability theory. Those
efforts produced the amalgam of probabilistic and statistical techniques used in
modern reliability analysis.

As a vast application area of probability, reliability theory studies, measures,
and analyzes system failures and repairs in order to improve the operational use of
systems [8]. While the pioneering work focused on military applications, today’s
competitive global economic environment has broadened the scope of reliability
research. To remain competitive, organizations must become more efficient while
offering superior products. Manufacturers are required to produce reliable products
through reliable processes; a business cannot remain economically viable if its
production capabilities are constantly in disrepair. Analogously, the armed forces
cannot conduct effective combat operations if their weapon systems are unreliable.
For this reason, the modern United States military has mandated that its equipment
operate reliably during peacetime and combat operations.

For new systems in the acquisition process, this emphasis is reflected in the
language contained in the Federal Acquisition Regulation (FAR). Federal law requires
acquisition program managers to develop and document a strategy for continuous
improvement of product reliability and availability while sustaining readiness [9].
While these regulations apply only to new system acquisitions, the reliability and
supportability of legacy systems are becoming an even more important issue. Scarce
resources are forcing the military to use systems longer than initially intended.
The reliability of these aging systems is the main factor in extending their
service life. For example, the reliability of the B-52H airframe was the main concern
of its service life extension program, and as a result the B-52H, which first flew in
1962, will remain in service until 2040. However, the military cannot ensure that its
systems will meet these reliability requirements unless adequate methods exist to
measure their reliability.

Reliability is defined by Ebeling [9] as the probability that a system will
perform a required function for a period of time under normal operating conditions.
This definition requires that those conditions, and the function the system is
expected to perform, be specified. A system’s maintainability and availability are
associated with its reliability, and collectively, these three measures are referred
to as reliability performance measures.

These metrics are very important for one reason: real-world systems fail. Some
failures are minor and result in inconveniences, minor damage and small economic
loss, while other failures are catastrophic and cause personal and corporate ruin.
Recent history provides examples of systems whose failures have caused tremendous
economic and personal loss. The Tacoma Narrows bridge fell into the Puget Sound
on November 7, 1940 because engineers did not fully understand the effects of metal
fatigue on the bridge’s reliability [9]. The collapse of the bridge cost millions of
dollars, while the 1986 Space Shuttle Challenger tragedy cost the lives of all seven
astronauts on board. The cause of the explosion was the failure of a simple O-ring
[9]. In the modern U.S. military, systems are becoming more complex. Greater
emphasis is being placed on fielding reliable and maintainable weapons systems.
When a military’s systems fail in combat, troops’ lives are endangered. Simply put,
the best weapon is useless if it cannot be effectively employed because it is unreliable.

Traditionally, the military has favored statistical techniques in measuring weapon
system reliability. This approach uses historical failure time observations to create
empirical measures of a system’s reliability. For components that are mass produced
at relatively low cost, this approach is feasible because the components can be
destructively tested to obtain the required data. Unfortunately, many (if not most) of
the systems employed by the U.S. military are not of this nature. An alternative
approach is to use probabilistic models to analytically derive reliability measures.
This approach defines an abstract model, consisting of a collection of mathematical
assumptions, that can be used to evaluate the reliability of any system that meets
those assumptions.

The probabilistic approach is more attractive than the statistical approach for
two main reasons. First, a system’s reliability measures can be determined generally,
whereas those obtained using the statistical approach are specific to a single
system. The probabilistic approach builds a general model that may be used to
find the reliability of an entire class of systems. Second, the probabilistic approach
facilitates the explicit modelling of the effect of the system’s operating environment
on its reliability. In contrast, this inherent dependence is not explicitly contained in
the failure time data used in the statistical approach.

Stochastic modelling is a vital tool when employing the probabilistic approach
to reliability analysis. Although the development of such an analytical, stochastic
model is difficult, a properly constructed model may accurately assess the system’s
failure dynamics. In this thesis, the tools of stochastic modelling are used to derive
the reliability measures of a system subject to environment-dependent, continuous
linear wear and random shocks. Currently, the probabilistic models that exist in the
literature do not consider systems subject to this type of failure mechanism.

The single-unit system accumulates damage until the damage exceeds a maximum
tolerable level, at which time the system fails. The cumulative damage can be
attributed to two separate processes. The first is an environment-dependent wear
process and the second is an independent shock process. An environment-dependent
wear process is one in which the rate of wear accumulation depends on the state of the
operating environment of the system. An example of an environment-dependent wear
process is a machine whose normal operating conditions may include two states:
a low capacity mode and a full capacity mode. The machine sustains wear at a
higher rate when the system operates in the full capacity mode. The wear rates
and the duration of time spent in each state determine how long the system operates
effectively before a failure. The wear process reflects the impact a system’s operating
environment has on its reliability.

The other contributor to system degradation is an independent shock process.
While the system is constantly accumulating damage due to wear, shocks occur at
random intervals causing additional damage. The time between shocks is a random
variable, and each time a shock occurs, a random amount of damage is inflicted. The
total damage caused by shock and wear over time may be modelled as a stochastic
process. An example of a system that accrues damage in this manner is the tire
on an aircraft landing gear. Consider the tire in two operating environments, high
speed (take off and landing) and low speed (taxi). When the aircraft operates at
a high speed, the tire wears at a greater rate than when it operates at the slower
taxi speed. This represents the state-dependent wear process. Further assume that
the instant the plane lands, the tire incurs a random amount of damage due to the
shock of landing. Because sortie durations are random, the sequence of landings and
their associated damage magnitudes may be considered as the shock process. The
total cumulative damage to the tire is the sum of the wear and shocks. If the tire
must be replaced when the total cumulative damage exceeds a fixed threshold (i.e.,
a control limit), then the probability that it does not need to be replaced during
some period of time is the reliability of the tire. Currently, there is no probabilistic
method to evaluate the tire’s reliability. Using an appropriate failure time model, it is
possible to determine the frequency with which preventive maintenance should occur
and the appropriate number of spares that should be maintained. Furthermore, if the
cost structure of this process is known, it is possible to develop a replacement policy
that can maximize availability while minimizing the overall maintenance cost.
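The damage mechanism described above can be sketched as a small Monte Carlo simulation. The following is a minimal illustration, not the model developed in this thesis: it assumes a two-state Markov environment, a constant wear rate in each state, Poisson shock arrivals, and exponentially distributed shock damage. The function name and all parameter names and values are hypothetical.

```python
import random

def simulate_failure_time(x, wear_rates, switch_rates, shock_rate, mean_shock, rng):
    """Return one simulated failure time for a unit that fails when damage exceeds x.

    Illustrative assumptions: the environment alternates between states 0 and 1
    after exponential sojourn times (rates in switch_rates), wear accrues linearly
    at wear_rates[state], shocks arrive as a Poisson process with rate shock_rate,
    and each shock adds exponentially distributed damage with mean mean_shock.
    """
    t, damage, state = 0.0, 0.0, 0
    while True:
        # Competing exponential clocks; memorylessness lets us resample both.
        to_switch = rng.expovariate(switch_rates[state])
        to_shock = rng.expovariate(shock_rate)
        dt = min(to_switch, to_shock)
        # Wear alone may carry the damage to the threshold before the next event.
        if damage + wear_rates[state] * dt >= x:
            return t + (x - damage) / wear_rates[state]
        t += dt
        damage += wear_rates[state] * dt
        if to_shock < to_switch:
            damage += rng.expovariate(1.0 / mean_shock)  # random shock damage
            if damage > x:
                return t  # the shock itself causes the failure
        else:
            state = 1 - state  # environment jumps to the other state

if __name__ == "__main__":
    rng = random.Random(42)
    times = [simulate_failure_time(10.0, (0.5, 2.0), (1.0, 1.0), 0.3, 1.0, rng)
             for _ in range(10_000)]
    print("estimated mean time to failure:", sum(times) / len(times))
```

In the tire example, state 0 would play the role of taxiing and state 1 the role of takeoff and landing; the fraction of simulated failure times exceeding a mission length t estimates the reliability at t.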

In this thesis, the reliability measures for a single-unit system that accumulates
damage over time due to the influence of a random environment and the random
occurrence of shocks are investigated. More specifically, the failure time distribution,
the mean time-to-failure, and all other moments are derived, as well as the long-run
availability of such a system when it is maintained under an inspect-and-replace
maintenance policy. In application, this research will assist civilian and military
analysts in accurately evaluating the reliability and availability of their systems. The
ability to compute these reliability measures will ultimately allow decision makers to
quantify the risk associated with operating a system over a prescribed time horizon,
providing useful insight into meaningful real-world problems and expanding the
current knowledge of reliability theory. In the next section the formal problem
definition is provided, as well as a road map for the proposed solution methodology.

1.2  Problem  Definition  and  Methodology 

The reliability and long-run availability of a system that is subject to
continuous, state-dependent, linear wear and random shocks are considered. Previous
research has investigated systems that incur damage caused solely by wear or shocks.
Currently there exist analytical expressions for the reliability of these components;
however, very little research has been done on systems that simultaneously incur
damage from both continuous wear and shocks. Where some results exist, they are
complicated multi-dimensional transform solutions that require sophisticated
numerical inversion algorithms. These techniques are computationally expensive and
make it difficult to efficiently evaluate the reliability and availability of even simple
systems. The main contribution of this thesis is to develop a simple closed-form
solution in a single transform dimension. The numerical inversion techniques for a
one-dimensional transform are more expedient and easier to implement. Previous
research indicates that one-dimensional inversion may be up to 400 times faster for
even a simple system.

The reliability of a system is the probability that it will survive for a given
length of time, and the complementary probability is the chance it will fail during
that time. Computing a system’s reliability directly can be difficult, and it is often
easier to derive the failure time distribution. In this thesis, it will be shown that the
failure time distribution satisfies a system of linear, first-order, partial differential
equations that may be solved via Laplace transforms. Once the reliability function
and failure time moments are derived, an analytical expression for the system’s
availability under an inspect-and-replace policy is obtained.
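To illustrate what numerical inversion of a one-dimensional transform involves, the sketch below implements the Gaver-Stehfest algorithm, a standard inversion method chosen here purely for illustration; it is not necessarily the algorithm used in this thesis. The example transform corresponds to an exponential failure time with rate 1, an assumption made only so that the result can be checked against the known distribution function 1 - exp(-t).

```python
from fractions import Fraction
from math import exp, factorial, log

def stehfest_coefficients(n):
    """Gaver-Stehfest weights V_1..V_n (n must be even), computed exactly."""
    half = n // 2
    coeffs = []
    for k in range(1, n + 1):
        total = Fraction(0)
        for j in range((k + 1) // 2, min(k, half) + 1):
            total += Fraction(
                j ** half * factorial(2 * j),
                factorial(half - j) * factorial(j) * factorial(j - 1)
                * factorial(k - j) * factorial(2 * j - k),
            )
        coeffs.append(float((-1) ** (k + half) * total))
    return coeffs

def invert_laplace(F, t, n=12):
    """Approximate f(t), t > 0, from its Laplace transform F(s)."""
    a = log(2.0) / t
    weights = stehfest_coefficients(n)
    return a * sum(v * F(k * a) for k, v in enumerate(weights, start=1))

# If T is exponential with rate 1, its Laplace-Stieltjes transform is 1/(1+s),
# so the Laplace transform of the distribution function is 1/(s(1+s)).
def cdf_transform(s):
    return 1.0 / (s * (1.0 + s))

if __name__ == "__main__":
    t = 1.0
    print("inverted:", invert_laplace(cdf_transform, t), "exact:", 1.0 - exp(-t))
```

Each evaluation of the distribution function requires only n evaluations of the transform, which is why a one-dimensional inversion is inexpensive compared with simulation.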

An inspect-and-replace policy is a maintenance policy used in many real-world
systems. It assumes that inspections occur at constant intervals and that an inspection
reveals whether the system is failed or operating. If the system is operating, nothing
is done until the next inspection, and if it has failed the system is replaced with a
new unit. Replacements are assumed to be instantaneous and inspections are assumed
to perfectly diagnose the system’s condition. Deriving a measure of availability
quantifies the effects repairs will have on the system and will allow analysts to
compare competing inspection policies. The results for reliability and availability are
confirmed by comparing the analytical solutions to those obtained via simulation.
Finally, the numerical results presented in this thesis demonstrate the broad
applicability of the models derived herein.
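The inspect-and-replace policy described above can be sketched as a renewal-reward simulation. The code below is a minimal illustration rather than the analytical expression derived in this thesis: each replacement starts a new cycle, the unit is up for its lifetime T, and the cycle ends at the first inspection at or after T. The exponential lifetime and the function names are assumptions made so that the estimate can be checked against a simple closed form.

```python
import math
import random

def simulate_availability(sample_failure_time, tau, n_cycles, rng):
    """Estimate long-run availability with inspections every tau time units.

    A failed unit stays down until the next inspection, where it is replaced
    instantly; availability is total uptime divided by total elapsed time.
    """
    uptime = 0.0
    elapsed = 0.0
    for _ in range(n_cycles):
        T = sample_failure_time(rng)
        uptime += T
        # The failure is discovered at the first inspection at or after T.
        elapsed += tau * math.ceil(T / tau)
    return uptime / elapsed

def exponential_availability(rate, tau):
    """Closed form for exponential lifetimes: E[T] / E[cycle length]."""
    return (1.0 - math.exp(-rate * tau)) / (rate * tau)

if __name__ == "__main__":
    rng = random.Random(0)
    est = simulate_availability(lambda r: r.expovariate(0.5), 1.0, 100_000, rng)
    print("simulated:", est, "closed form:", exponential_availability(0.5, 1.0))
```

Evaluating the same function for several values of tau gives a simple way to compare competing inspection intervals.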

1.3  Thesis  Outline


The next chapter includes a review of the previous work done in this field. It
begins with an overview of the early works in reliability theory and continues with
a detailed look at both wear and shock models, as well as systems that incorporate
both damage mechanisms. Chapter 2 provides the reader a frame of reference to
understand the contributions of this thesis. It also shows that all existing methods
are too cumbersome to implement in practice. Furthermore, this chapter highlights
the existing gap in the literature of compound damage models, the gap that this
thesis will close.

The formal notation and mathematical model are developed in Chapter 3. In
the first section an appropriate stochastic model is constructed. Next, the main
results of this thesis are obtained by deriving a system of partial differential equations
satisfied by the probability distribution of the damage sustained during a time
interval. That system is then solved via Laplace transform techniques. Using the
transform solution, the failure time distribution is obtained. The third section
reduces the two-dimensional result to a single dimension. Using the one-dimensional
result, the moments of the failure time distribution are derived. This chapter closes
by deriving an analytical expression for the long-run availability of the system.

Chapter 4 is dedicated to numerical examples that show the accuracy of the
one-dimensional inversion as compared to a Monte Carlo simulation. The examples,
drawn from a variety of applications, illustrate the means by which the results of
Chapter 3 can be used to analyze real-world problems. Most importantly, this chapter
demonstrates the enormous disparity in computational effort between the analytical
method and Monte Carlo simulation. More specifically, the reliability measures were
computed in only a few seconds using the analytical results as compared to roughly
four hours for the same measures via simulation. Moreover, a new simulation model
(including verification and validation) was needed for each implementation, whereas
the analytical results required only a set of new input parameters.

The final chapter concludes the thesis by summarizing the main results and
the contributions to the academic community, the United States Air Force and the
Department of Defense. Also contained in Chapter 5 are recommendations for future
research directions.

2.  Review  of  the  Literature

In this chapter the literature in the field of reliability is reviewed. The emphasis
is on the development of stochastic models to analyze systems that are subject to
wear and/or random shocks. The first section is a synopsis of the important works in
the field of reliability prior to 1970. The literature and evolution of stochastic shock
and wear models are reviewed in the second and third sections, respectively. In
the fourth section the literature for models that combine the effects of random shocks
and continuous wear is investigated. The fifth section of this chapter considers some
of the literature concerned with the use of stochastic damage models for deriving an
optimal replacement policy. The final section connects the literature reviewed to the
research effort of this thesis.

2.1  The  History  of  Reliability  Theory 

The mathematical study of reliability has grown out of the demands of modern
technology, and particularly from the experiences of World War II. The early
foundation of reliability theory was in the actuarial concepts developed in the insurance
industry [9]. While these concepts evolved during the 1930s into the study of
structural reliability and fatigue failures, it was not until the Second World War that the
mathematics of reliability was rigorously studied. The production of materiel and
the high failure rates of complex combat systems fueled the rapid growth of the field
during the late 1930s and through the end of the war.

The pioneers of reliability theory focused on two main problems. The first,
fatigue life, was extensively studied by Weibull [38], who in 1939 introduced a
probability distribution to describe the breaking strength of materials. He also proposed
using this distribution to describe system lifetimes. Replacement problems were the
second area studied during this time. The study of replacement problems introduced
the concepts of stochastic modelling to reliability. The mathematics of queueing and
renewal theory were used to solve early replacement problems. Lotka’s work [25] in
1939 used the theory of renewals to address replacement problems in an industrial
setting.

The focus of reliability research in the 1950s was on improving military and
civilian aviation systems. Many military systems, such as radar, depended on
complex electronic equipment. Highly unreliable vacuum tubes were used in almost all
avionic and electronic equipment of the era. The dependence of the aviation industry
on these notoriously undependable components led to the formation of Aeronautical
Radio, Inc. (ARINC). This group collected and analyzed defective vacuum tubes,
and they were able to increase the reliability of numerous types of tubes [3]. ARINC
was the first group formed to study the problems of reliability in a real-world setting.

ARINC signaled a shift in how system reliability was studied and improved.
Previously, an unreliable system was made more reliable by adding redundant
components, but this over-engineering of systems made them larger, heavier, and more
complicated. When the United States and the Soviet Union began to develop
ballistic missiles in the 1950s, the goal of reliability research was to design more reliable
components, eliminating the need for redundancy. The methods used to calculate a
system’s reliability needed to be changed as well. Prior to this decade, a system’s
reliability was determined by destructive life testing. This method works well for
simple, inexpensive systems, but it is not acceptable for calculating the
reliability of expensive, complex items. Scientists needed to develop more powerful
analytical techniques for determining a system’s reliability measures [4].

The United States Air Force formed the ad hoc Group on Reliability of
Electronic Equipment in December of 1950. The group studied the reliability of Air
Force systems and recommended measures to increase reliability and reduce
maintenance [3]. The Department of Defense followed the example of the Air Force and
established the Advisory Group on Reliability of Electronic Equipment (AGREE)
in 1952. In 1957 AGREE published a report entitled “Reliability Versus the Cost
of Failure” [32]. This report evaluated the life cycle costs of fielding more reliable
systems; it also included acceptance limits and reliability requirements.

The classic view of reliability was to consider failures as random occurrences.
These occurrences are not actually random, but instead are caused by the physical
and chemical interactions of the system. Because the conditions that cause the
failures are not always understood, the failure pattern is modelled with a probability
distribution [9]. Epstein and Sobel [11] began investigating various distributions used
to model failure processes in the 1950s. After discussions with electronics experts they
concluded that they should focus on a non-normal distribution with the form

$$ f(x; \theta) = \frac{1}{\theta} e^{-x/\theta}, \qquad \theta > 0,\ x > 0. \tag{2.1} $$

Their work began the widespread practice of assuming failure times to be
exponentially distributed.

The use of the exponential distribution to model random failure times was also
studied by Davis [7]. Davis accumulated the results of various lifetime tests and fit
different probability distributions to the data. He then performed several
goodness-of-fit tests to measure the ability of competing distributions to model the failure
data, and found the exponential distribution to be the best. His paper is often cited
by subsequent authors who assume failure times to be exponential. The publication
of [7] and [11] made the exponential distribution the probability distribution of choice
for reliability research.

The exponential distribution gained popularity for another reason: it has
properties that simplify the reliability analysis of many systems. Complex systems, like
airplanes and cars, are composed of many subsystems. If the failure times of these
subsystems are independent and exponentially distributed, and the system fails as
soon as any subsystem fails, then their failure rates can be summed to find the
failure rate of the composite system. The exponential distribution also has
the memoryless, or Markovian, property. That is, the future lifetime of a
component does not depend on how long it has already operated. This property is very
important when using a renewal theory approach because exponential failure times
simplify the analysis of many systems.
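The two properties described above can be stated compactly. The following short derivation is a sketch that assumes $n$ subsystems with independent exponential lifetimes $T_1, \ldots, T_n$ and rates $\lambda_1, \ldots, \lambda_n$, arranged so that the composite system fails as soon as any subsystem fails; the symbols are introduced here for illustration only.

```latex
\begin{align*}
% Series system: the failure rates add.
P(T_{\mathrm{sys}} > t) &= P\bigl(\min(T_1, \ldots, T_n) > t\bigr)
  = \prod_{i=1}^{n} e^{-\lambda_i t}
  = \exp\Bigl(-t \sum_{i=1}^{n} \lambda_i\Bigr), \\
% Memoryless property of one exponential lifetime T with rate \lambda.
P(T > s + t \mid T > s) &= \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
  = e^{-\lambda t} = P(T > t).
\end{align*}
```

The first line shows that the system lifetime is again exponential, with rate equal to the sum of the subsystem rates; the second shows that a unit that has survived to time $s$ has the same remaining-life distribution as a new unit.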

The missile systems and communication networks of the 1960s presented new
challenges to reliability researchers. The common practice of assuming an
exponential distribution did not accurately model their system reliability. The use of
semi-Markov processes to compute system reliability measures was introduced by
Weiss [39] in 1956, and two years later Smith [36] published a cogent summation of
the known mathematical results in renewal theory. The limiting probability
techniques of queueing theory were used at the “Moscow School” of reliability to develop
maintenance and repair models [3]. These works helped bridge the gap between the
fields of stochastic modelling and reliability.

The emergence of nuclear power plants and the subsequent study of their
reliability directed most research in the field to fault tree analysis during the 1970s [3].
Haasl [17] provides the best description of how to construct a fault tree in an
engineering setting [3]. However, fault trees of complicated systems needed more
powerful algorithms to produce significant reliability results. Research into Boolean and
set theoretic combinatorial methods produced the results needed to allow reliability
engineers to design safe and effective nuclear power plants [3].

Finally,  numerous  important  shock  and  wear  models  were  developed  from  the 
1970s  through  the  present.  The  distinguishing  feature  of  these  models  is  that  they 
are  derived  by  considering  stochastic  processes  that  describe  the  failure  generating 
mechanisms  [34].  The  dynamic  environments  in  which  the  systems  operate  are  not 
modelled  with  classical  techniques,  and  stochastic  models  provide  the  flexibility  to 
describe  the  failure  processes.  These  models  are  of  great  importance  to  this  thesis 
research,  and  the  associated  literature  is  reviewed  in  detail  in  the  next  three  sections. 

2.2  Shock  Models


Shock models are used to analyze special stochastic point processes. A
stochastic point process is characterized by isolated events occurring randomly over some
domain, normally time [9]. When the random times between events form a sequence
of independent and identically distributed random variables, the stochastic point
process is called a renewal process. The theory of renewals is a powerful tool used
to analyze the reliability of many systems that are subject to damage caused by
random shocks.

Shock models can generally be described as follows. A system begins operating
at time $t = 0$ in perfect condition, and after a random amount of time $\tau_1$ the
system experiences a shock. When the first shock occurs the system incurs some
degree of damage $Y_1$. The magnitude of the shock may be a random variable or a
deterministic value. Since the latter is a special case of the former, we will assume
$Y_1$ is a random variable. The system continues to operate, and after another random
amount of time $\tau_2$, a second shock occurs whose magnitude is given by the random
variable $Y_2$. The random variables $\tau_i$ and $\tau_j$, $i \neq j$, are stochastically
equivalent, and the shock arrival process forms a renewal process. The shocks continue
until the stopping criterion is met, at which time the system is considered to have
failed. The failed system is replaced with an identical unit, and the process renews.
Because the replacement unit is identical to the original, the random time until it fails
is stochastically equivalent to the original failure time, and another renewal process is
formed.

The literature in the area of shock models is concerned with two different
failure mechanisms. The first is the cumulative damage shock model. In this model,
the damage sustained on the $i$th shock is a random variable, whose distribution $W(\cdot)$
is known, and the total damage the system has experienced up to time $t$ is a random
variable $D(t)$. The random variable $D(t)$ depends on the number of shocks that
occur up to time $t$, and the magnitude of those shocks. The number of shocks that
occur before time $t$ is the random variable $N(t)$, and the total damage is

$$ D(t) = \sum_{i=1}^{N(t)} Y_i. \tag{2.2} $$

In the cumulative damage model the system will fail when the total damage
exceeds a threshold value $x$, and the time of failure is a random variable given by

$$ T_x = \inf\{t : D(t) > x\}. \tag{2.3} $$

The cumulative shock model is very useful when analyzing systems whose
deterioration depends on the total effect of the shocks over time.
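The cumulative damage model can be illustrated with a short simulation of the failure time defined above. This is a minimal sketch under assumptions not made in the text: shocks arrive as a Poisson process and shock magnitudes are exponentially distributed, which gives a known mean failure time to check against. The function and parameter names are hypothetical.

```python
import random

def cumulative_shock_failure_time(x, shock_rate, damage_rate, rng):
    """Simulate T_x = inf{t : D(t) > x} for the cumulative damage shock model.

    Illustrative assumptions: exponential inter-shock times with rate shock_rate
    and exponential shock magnitudes with rate damage_rate (mean 1/damage_rate).
    """
    t, damage = 0.0, 0.0
    while damage <= x:
        t += rng.expovariate(shock_rate)        # time until the next shock
        damage += rng.expovariate(damage_rate)  # damage added by that shock
    return t  # arrival time of the shock that pushed the damage above x

if __name__ == "__main__":
    rng = random.Random(7)
    x, shock_rate, damage_rate = 3.0, 2.0, 1.0
    samples = [cumulative_shock_failure_time(x, shock_rate, damage_rate, rng)
               for _ in range(20_000)]
    # With exponential magnitudes, the number of shocks needed to exceed x is
    # 1 + Poisson(damage_rate * x), so E[T_x] = (1 + damage_rate * x) / shock_rate.
    print("simulated mean:", sum(samples) / len(samples),
          "theoretical mean:", (1.0 + damage_rate * x) / shock_rate)
```

Other magnitude or inter-arrival distributions can be substituted by changing the two sampling lines, although the closed-form mean in the comment then no longer applies.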

The  second  shock  model  found  in  the  literature  is  a  maximum  shock  model. 
Shocks  occur  according  to  a  renewal  process,  and  the  ith  shock  causes  a  random 
amount of damage $Y_i$. However, the system will function until the magnitude of
a single shock exceeds the threshold value $x$. The random variable $S_i$ denotes the
occurrence time of the $i$th shock, and the dependence of the random failure time
on the number of shocks and their magnitude can be characterized as

$$T_x = \inf\{S_i : \max_{1 \le j \le i} Y_j > x\}. \qquad (2.4)$$

This model is a special case of the cumulative shock model and is used to compute the
reliability of “loaded” systems, where a system experiences random loads and fails
when a single load exceeds the system’s capacity. The modern body of literature
on shock-based failure models stems from the seminal paper of Esary et al. [12]
published in 1973.
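
A minimal Python sketch of the maximum shock model follows. It returns the arrival time of the first shock whose magnitude exceeds $x$. Poisson arrivals and the exponential magnitude used in the example are assumptions made for illustration, and the function name is hypothetical.

```python
import math
import random

def max_shock_failure_time(x, shock_rate, draw_magnitude, rng):
    """Return one sample of T_x: the time of the first shock with magnitude > x.

    Assumption (illustrative only): shocks arrive as a Poisson process with
    rate shock_rate; draw_magnitude(rng) returns one shock magnitude.
    """
    t = 0.0
    while True:
        t += rng.expovariate(shock_rate)   # arrival time of the next shock
        if draw_magnitude(rng) > x:        # a single shock exceeds the threshold
            return t

rng = random.Random(7)
samples = [max_shock_failure_time(2.0, 1.0, lambda g: g.expovariate(1.0), rng)
           for _ in range(20000)]
print(sum(samples) / len(samples), math.exp(2.0))
```

With Poisson arrivals of rate $\lambda$ and independent magnitudes, the fatal shocks form a thinned Poisson process, so the failure time is exponential with rate $\lambda P\{Y > x\}$; the two printed values should therefore be close.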

Esary et al. [12] consider a system subject to random shocks that occur according
to a Poisson process with rate $\lambda$. Their paper deserves special attention because it is
the first to examine the effects of random shocks and wear. The important shock
model results of [12] are reviewed here and the results for wear models are addressed
in a subsequent section. The authors define $\bar{P}_k$ as the probability that
the system survives the first $k$ shocks, $k = 0, 1, 2, \ldots$, with $\bar{P}_0 = 1$. The probability that the system survives
beyond time $t$ is then given by

$$\bar{H}(t) = \sum_{k=0}^{\infty} \bar{P}_k \, e^{-\lambda t} \frac{(\lambda t)^k}{k!}. \qquad (2.5)$$

The authors demonstrate that the various properties of the discrete failure time
distribution $\bar{P}_k$ are reflected in the corresponding properties of the continuous life
distribution $H(t) = 1 - \bar{H}(t)$. They then investigate the properties of various models including
the cumulative damage and maximum shock models. Finally, the authors investigate
the properties of a shock model whose damage threshold may be modelled as a
random variable. This model is appropriate to describe systems in which there is
significant individual variation in a unit’s ability to withstand damage.
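
The survival probability of a Poisson shock model is a Poisson-weighted mixture of the probabilities of surviving the first $k$ shocks, $\sum_{k \ge 0} \bar{P}_k e^{-\lambda t} (\lambda t)^k / k!$. The Python sketch below evaluates a truncated version of this sum. The truncation level `k_max` and the function name are assumptions introduced for illustration.

```python
import math

def survival(t, rate, p_bar, k_max=500):
    """Truncated sum of p_bar(k) * exp(-rate*t) * (rate*t)**k / k! over k = 0..k_max.

    p_bar(k) is the probability of surviving the first k shocks (p_bar(0) = 1).
    k_max is an illustrative truncation level, adequate for moderate rate*t.
    """
    weight = math.exp(-rate * t)        # Poisson probability of zero shocks
    total = 0.0
    for k in range(k_max + 1):
        total += p_bar(k) * weight
        weight *= rate * t / (k + 1)    # Poisson probability of k + 1 shocks
    return total

# Each shock is survived independently with probability 0.5, so p_bar(k) = 0.5**k.
print(survival(1.5, 2.0, lambda k: 0.5 ** k), math.exp(-1.5))
```

When $\bar{P}_k = q^k$ the sum collapses to $e^{-\lambda t (1 - q)}$, which gives a simple check on the truncated evaluation.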

Rade [31] introduced a new model, the parallel shock model. A parallel shock
model is used to describe a system having $n$ identical components, with shocks
occurring according to a Poisson process. The first case Rade [31] studies
is as follows. When a shock occurs, each component fails with probability
$p$, independent of all other components. A simple closed-form expression for the
Laplace-Stieltjes transform of the failure time distribution and the expected time
until system failure is derived.
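
The first parallel shock model admits a simple numerical check. The sketch below is not Rade's derivation; it is an elementary calculation under the assumption that the system fails once all $n$ components have failed. The number of shocks to failure is then the maximum of $n$ independent geometric random variables, and because the Poisson arrival times are independent of the component failures, the expected failure time is the expected number of shocks divided by the shock rate. All names are hypothetical.

```python
def expected_shocks_to_failure(n, p, tol=1e-12):
    """Expected number of shocks until all n components have failed (0 < p <= 1).

    Uses E[N] = sum over m >= 0 of P(N > m), where
    P(N > m) = 1 - (1 - (1 - p)**m)**n is the chance that at least one
    component is still working after m shocks.
    """
    total, m = 0.0, 0
    while True:
        term = 1.0 - (1.0 - (1.0 - p) ** m) ** n
        if term < tol:                 # remaining tail is negligible
            return total
        total += term
        m += 1

def expected_failure_time(n, p, shock_rate):
    """Expected system failure time for Poisson shocks with rate shock_rate."""
    return expected_shocks_to_failure(n, p) / shock_rate

print(expected_shocks_to_failure(2, 0.5))   # analytically 8/3
```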

Rade [31] next considers the case in which shocks occur in the same manner,
but the probability of component failure depends on the number of functioning
components. This is a more realistic assumption for load-sharing systems. In the
second model, the chance a single component fails is $p_k$, where $k$ is the number
of functioning components. Rade’s results [31] for the load-sharing model were
derived using elementary probability arguments and by conditioning on the number
of components that fail at the time of the first shock occurrence.




Esary et al. [12] and Rade [31] considered systems for which the times between
shocks are independent of the shock magnitudes. Shanthikumar and Sumita [33] examined the case in which
shock magnitudes are correlated with the inter-arrival times of the renewal process. Their paper considers a maximum
shock model associated with a correlated pair of random variables $X_n$ and $Y_n$, where
$X_n$ is the magnitude of the $n$th shock and $Y_n$ is the time between the $(n-1)$st and $n$th
shocks. The authors developed two related models. In the first model the magnitude
of the $n$th shock depends on the length of the interval since the last shock, and in
the second model the magnitude of the $n$th shock depends on the time until the next
shock. The authors analyze a sequence of independent and identically distributed
random pairs $(X_n, Y_n)$, $n = 0, 1, 2, \ldots$, with a common joint distribution function.
The pairs $(X_n, Y_n)$ are mutually independent, but $X_n$ and $Y_n$ may be correlated for
a given $n$; thus, $(X_n, Y_n)$ is called a correlated pair of renewal sequences.

When $X_n$ and $Y_n$ are independent, and the $Y_n$ are identically and exponentially
distributed, the general model simplifies to the Poisson shock model of Rade [31],
but the Poisson shock model is not appropriate if $X_n$ and $Y_n$ are correlated. There
are many such examples in the real world. For instance, a stochastic clearing system produces
inventory, and the inventory grows in quantity over time. Orders are filled at
random times, and the times between shipments are independent, identically
distributed random variables with some general distribution. The shipments form a
renewal process and can be viewed as shocks. The amount of cleared product
depends on the time since the last shipment. Shanthikumar and Sumita [33] develop
a transform result, an exponential limit theorem, and properties of the associated
renewal process of the failure times.

In a subsequent paper, Sumita and Shanthikumar [37] revisit this model and
incorporate a cumulative shock model. They derive transform results similar
to those of their previous paper, as well as asymptotic properties of the
system failure time.

Igaki et al. [18] extend Shanthikumar and Sumita [33] by incorporating the
influence of an external system. This approach allows the shock process to be
driven by another independent process. This is the first state-dependent shock model,
and it extends the work of Çinlar [6] on state-dependent wear processes. In [18] the
system is assumed to occupy some state $i \in S = \{1, 2, \ldots\}$ for an exponentially
distributed random amount of time, and then a shock occurs. When the external
system is in state $i$, shocks occur with rate $\lambda_i$. The magnitude of the shock is a
random variable whose distribution function is $F_i(\cdot)$. When a shock occurs, the
system transitions to state $j \in S$, $j \neq i$.

In the model by Igaki et al. [18] the system changes state after each shock
according to a continuous-time Markov chain (CTMC). The joint distribution of
$(X_n, Y_n)$ then depends on the transitions of the underlying CTMC. First, the failure time
distribution is derived as a Laplace transform solution for both the cumulative
and maximum shock models. Second, expressions for the expected value, second
moment, and variance are found for each model. Finally, the authors turn their
attention to the limiting behavior by proving a theorem that gives the limit of the
failure time distribution as $t \to \infty$.

More recently, Skoilakis [35] generalized the results of Rade [31] and Nakagawa [27].
In the original models, a system has parallel components that are
subject to shocks according to a standard renewal process. Each time a shock occurs,
the components fail independently with probability equal to the magnitude of
the shock. The shock intensity distribution is temporally homogeneous. Skoilakis [35]
claims these models are not realistic and extends them by allowing the magnitudes of
the shocks to change with time. For example, when a system runs for some length of
time it accumulates damage. This damage can be thought of as wear, and a system
that is wearing out will be more susceptible to shocks. This can be modelled by
allowing the shocks to increase in magnitude and intensity as time passes. Skoilakis’
model accounts for system wear out or system repairs.

If the expected value of the inter-shock times is finite, and the $j$th shock,
independent of everything else, has a random magnitude $x \in [0, 1]$ with distribution
$G_j$, then the author derives an expression for the mean failure time and the
Laplace-Stieltjes transform of the failure time distribution. His second result states that
if the inter-shock time distribution is absolutely continuous, then the failure time
distribution is also absolutely continuous.

Ebrahimi  [10]  developed  a  technique  for  the  comparison  of  different  cumulative 
shock  models.  He  assumes  shocks  occur  according  to  a  Poisson  process,  and  each 
shock  causes  a  random  amount  of  damage.  As  with  all  cumulative  shock  models, 
the  system  fails  when  the  total  damage  caused  by  the  shocks  exceeds  a  threshold  x. 
He  derives  sufficient  conditions  for  the  failure  rate  order  and  stochastic  order  to  hold 
between  the  random  lifetimes  of  two  systems  whose  damage  can  be  described  as  a 
cumulative  Poisson  shock  model.  His  results  are  important  because  they  allow  the 
direct  comparison  of  two  random  processes  via  stochastic  ordering. 

The  literature  reviewed  thus  far  has  focused  on  derivation  of  the  failure  time 
distributions  and  their  associated  asymptotic  properties.  Gottlieb  [15]  does  not 
consider  a  specific  damage  process;  instead  he  assumes  the  device  is  more  likely  to 
fail  as  the  total  damage  increases.  Gottlieb  [15]  developed  sufficient  conditions  for 
both  the  damage  process  and  the  device’s  ability  to  survive  damage  that  guarantee 
the  lifetime  distribution  has  an  increasing  failure  rate  (IFR).  He  next  identifies  classes 
of  stochastic  damage  processes  that  satisfy  these  conditions. 

Shock models are useful tools to describe the failure mechanism of many systems,
but they cannot accurately describe the reliability of systems that are subject
to continuous wear. In the next section the relevant wear models and their associated
properties are reviewed.

2.3 Wear Models


Shock models assume damage to a system is sustained at discrete times, but
this assumption is too restrictive for systems that constantly experience wear. Early
wear models used classic shock models and examined the results when the shocks
occurred continuously. More sophisticated wear models were quickly developed to
analyze reliability measures of systems subject to continuous wear. A general
state-dependent wear model assumes a system accumulates damage at a rate $r(i)$, $i \in S =
\{1, 2, \ldots, K\}$, which depends on the state the system occupies. The system changes
states according to some semi-Markov process until some time when the total amount
of damage exceeds a threshold $x$, and the system ceases to function. Esary et al.
[12] defined a general wear process $\{Z(t) : t \ge 0\}$ such that

1. $Z(0) = 0$, and $Z(t + \Delta t) - Z(t) \ge 0$ for all $t, \Delta t \ge 0$ with probability 1,

2. $\{Z(t) : t \ge 0\}$ is a Markov process, and

3. $P\{Z(t + \Delta t) - Z(t) \le u \mid Z(t) = z\}$ is decreasing in $z$ and $t$ for $t, z, \Delta t \ge 0$.

They prove that a system subject to continuous wear has an increasing hazard rate average.
The proof is accomplished by foregoing the assumption of discrete damage
occurring  at  isolated  moments,  and  instead  investigating  the  properties  of  a  system 
experiencing  continuous  shocks.  Other  results  for  similar  wear  models  exist  in  [26] 
and  [5].  Wear  processes  like  the  one  described  in  [12]  are  insufficient  to  characterize 
many  real  systems,  and  a  recent  trend  has  been  the  development  of  failure  models 
that  consider  two  processes:  one  for  the  traditional  wear  process  and  a  second  for 
the  ambient  process  that  drives  the  wear  process. 

Çinlar [5] offers a rigorous mathematical way to describe this failure mechanism
by a Markov additive process (MAP). A MAP is a bivariate stochastic process
$\{(Z(t), X(t)) : t \ge 0\}$. The $Z(t)$ process is an independent CTMC, and the $X(t)$
process is an additive functional of the first. Using a MAP, Çinlar [6] derives results
for shock and wear models with this unique structure. He assumes stationary independent
increments of the damage process, meaning that the system is temporally
homogeneous, and the damage rate depends only on the state of the system. Under
this assumption, the author derives the failure time distribution, where the damage
threshold is a random variable. Also, he obtains the hazard rate function for the
time to failure of a multiplicative killing type failure mechanism. Finally, Çinlar [6]
shows that, given a gamma process with a shape parameter that is a function of
Brownian motion, the resulting lifetime distribution is Weibull.

Singpurwalla  [34]  offers  a  comprehensive  review  of  the  results  obtained  for 
models  whose  failures  depend  on  a  dynamic  environment.  In  his  work,  the  author 
examines  the  four  strategies  developed  to  address  these  systems.  The  first  strategy 
describes  the  wear  of  a  system  as  a  diffusion  process.  The  second  strategy  models  the 
damage  mechanism  caused  by  a  shock-inflicting  process.  Third,  a  response  variable 
that  is  strongly  correlated  with  the  system’s  lifetime  is  modelled  using  a  stochastic 
process. Finally, a covariate process, like the MAP due to Çinlar [5], is used to link
the  damage  process  to  the  underlying  excitation  process. 

Many wear models exist, but by Çinlar’s [6] own admission they are not easily
implemented. Much of the recent literature on stochastic shock and wear processes
focuses on deriving more readily implementable results. Kharoufeh [19] uses a MAP
to model a system that is subject to continuous, state-dependent, linear wear. His
main result is a compact transform expression for the failure time distribution. His
results are easier to implement numerically than previous works in the field. The
current  thesis  research  effort  is  an  extension  of  his  model  and  uses  similar  analysis 
techniques. 

Kharoufeh [19] examined a single-unit system that accumulates damage via a continuous
wear process, $\{X(t) : t \ge 0\}$. The system begins in perfect working condition
and accrues wear at various rates which depend on an external random environment.
The random environment is modelled as a continuous-time stochastic process
$\{Z(t) : t \ge 0\}$, and when the ambient process is in state $i \in S = \{1, 2, \ldots, K\}$, the
system accumulates damage at a linear rate $r(i)$. The system continues to function
until the instant when the accumulated damage reaches some threshold $x$. The wear
process is assumed to be temporally homogeneous, and the environmental process
has a finite state space.

Under these assumptions the author developed a Laplace-Stieltjes transform expression
for the failure time distribution. Using the relationship between the total wear
accumulated at time $t$, $X(t)$, and the random time until failure, $T_x$, he derived the
failure time distribution by solving a system of first-order, linear partial differential
equations. The moments of the failure time distribution were also derived. This
thesis extends his analysis by superimposing random shocks on the wear model.
To  that  end,  the  literature  for  systems  subject  to  compound  damage  processes  is 
reviewed  in  the  next  section. 

2.4 Compound Damage Models

This  thesis  will  extend  the  results  of  Kharoufeh  [19]  by  incorporating  random 
shocks  into  the  state-dependent  wear  process.  Currently,  there  are  few  models  that 
incorporate  the  effects  of  both  shocks  and  wear.  Most  of  these  models  allow  for  shocks 
to occur only at transition epochs. Çinlar’s model [6] is aimed at studying systems
that are subject to continuous wear and shocks. He assumes shocks occur often,
but the magnitude of individual shocks is very small. He defines a compound Poisson
process as one in which shocks occur according to a Poisson process with rate $\lambda$,
and the magnitudes are independent and identically distributed random variables
with a common distribution. Next, he defines a gamma process as one in which the damage
process has the gamma density. Finally, a compound Poisson process in a random
environment is one in which shocks occur with some rate $\lambda_k$, depending on the state
of the system, $k$. Çinlar [6] derives the failure time distribution and associated
properties for these models. His results, however, are not easily implemented.

Several researchers have examined the effect maintenance has on systems whose
reliability is a function of their operating environment. Klutke et al. [22] examined the
availability of an inspected system whose inter-inspection times and wear rates were
random. A subsequent paper by Klutke and Yang [23] derives availability results
for a system subject to constant degradation, shocks, and a deterministic inspection
policy. Kiessler et al. [21] studied a system with soft failures, whose wear rate
depended explicitly on a continuous-time Markov chain. These models encompass
the effect a system’s operating environment has on its availability, but they do not
allow for the system to simultaneously experience state-dependent wear and random
shocks.

2.5  Optimal  Replacement  Literature 

One  of  the  most  significant  applications  of  shock  models  is  the  development  of 
an  optimal  maintenance  policy.  In  general,  an  optimal  replacement  policy  balances 
the cost of preventative replacement with the higher cost of replacing a failed system.
An optimal policy minimizes an associated cost function over a given planning
horizon. 

Nakagawa [27] found an optimal replacement policy for Rade’s [31] model.
Recall that Rade’s model is a parallel system of $n$ components that is subject to shocks according
to a Poisson process. Each component independently has a probability of
failure $p$. The probability of failure is determined by the magnitude of the shock,
which is a random variable with a known probability distribution.
Nakagawa [27] considered the following replacement problem. The entire
system is exchanged preventively before system failure if the total number of failed
components is greater than some number $k$. If the system fails, it is replaced at
the time of failure. A cost $c_f$ is incurred when the system fails and is replaced. A cost
$c_p$ is incurred when the system is replaced prior to failure. This problem is only
interesting if $c_f > c_p$. Nakagawa [27] derives an expression for $k$ that minimizes the
expected cost per unit time in the long run.

Nakagawa’s  [27]  results  were  valid  for  Rade’s  [31]  model;  however,  there  are 
many  generalizations  of  this  optimization  problem.  Feldman  [13]  considered  a  more 
general  semi-Markov  shock  model.  In  that  work  a  single-unit  system  is  subject  to 
randomly  occurring  shocks.  The  magnitude  of  the  shocks  is  also  random.  The  system 
will  fail  when  the  total  accrued  damage  exceeds  some  threshold.  The  probability  of 
failure  is  a  function  of  the  total  damage  done  by  previous  shocks.  This  is  said  to  be 
a  semi-Markov  process  because  both  the  time  between  shocks  and  the  damage  done 
depend  only  on  the  current  damage  level. 

Using this framework Feldman [13] first developed a cost function T to be minimized.
To that end, he used a Markov renewal argument to derive the replacement
time distribution and the expected replacement time. Next, he turned his attention
to the derivative of the cost function. Setting the derivative equal to zero and
simplifying provides the optimal result.

In a later paper, Feldman [14] finds an optimal replacement policy for a system
that is subject to continuous shocks. His first paper assumed that a finite number of
shocks occur in a finite time span. In the later work he relaxes this assumption and
allows  for  a  collection  of  shock  times  that  form  a  random  set  containing  no  isolated 
points.  Feldman  [14]  uses  an  embedded  MAP  to  solve  the  optimal  replacement 
problem.  He  develops  the  theory  for  the  optimal  stopping  of  a  MAP.  Using  the 
optimal  stopping  theory,  he  obtains  an  optimal  replacement  policy. 

Gottlieb [16] creates a more encompassing model and finds an optimal replacement
policy. In his model there is no assumption about the monotonicity of the
failure rate (previous models required non-decreasing failure rates). Gottlieb [16]
also allows for replacement at any time, not just immediately after shocks. Under
these conditions he finds the optimal replacement policy is to replace when the item
reaches a threshold which depends on the state after the last jump.

Posner and Zuckerman [30] extend all of the previous optimal replacement
problems by examining them as control limit policies and applying the theory of
Markov decision processes. A control limit policy is a replacement rule in which a
system is replaced on failure or when the damage level exceeds some threshold value
$x$. Treating the problem in this way allows the authors to examine the structure
of the optimal replacement policy and specify sufficient conditions under which the
optimal policy possesses the control limit property. The control limit property allows
the user to find a threshold value $x$ that triggers a replacement. Using well-known
methods of Markov decision processes, the optimal value of $x$ was derived.

2.6 Summary

The study of reliability has evolved during the last century from a minor application
of probability to its own mathematical discipline. Modern reliability theory is
a vast field incorporating classical probability and renewal theory. The brief review
of  reliability  theory’s  development  in  section  2.1  highlighted  the  unique  way  this 
science  has  evolved  over  the  past  century.  Today’s  systems  are  very  complex,  and 
the  previous  methods  of  analyzing  reliability  measures  became  obsolete.  Researchers 
began  to  incorporate  the  techniques  of  stochastic  modelling  to  help  keep  pace  with 
the  complexity  of  the  systems  they  were  analyzing. 

The  development  of  stochastic  models  provided  researchers  with  powerful  tools 
to  analyze  the  behavior  of  systems  whose  failure  mechanisms  may  be  characterized 
by  random  processes.  The  relevant  shock  and  wear  models  were  reviewed  to  develop 
a  historical  perspective  on  the  current  research.  Although  these  models  only  consider 
a  single  damage  mechanism,  understanding  the  analysis  used  to  develop  the  results 
is  crucial  in  solving  the  current  problem. 

Recent  research  in  this  field  has  aimed  at  modelling  the  damage  mechanism 
as a function of some other underlying stochastic process. This technique allows
researchers to more accurately model the failure dynamics of complex systems, and
will be used in developing the analysis of this thesis. A model that considers a system’s
reliability without considering the characteristics of its operating environment
may  lead  to  an  incorrect  analysis  of  the  system’s  reliability.  One  technique  used  to 
capture  the  dependence  of  a  wear  process  on  some  external  excitation  process  is  the 
use  of  a  Markov  additive  process  to  model  the  failure  mechanism. 

Currently, there are a limited number of models that effectively describe systems
that are simultaneously subject to random shocks and wear. The salient models
have  been  reviewed  in  section  2.4,  and  it  is  worth  noting  that  the  results  that  do  exist 
in  the  literature  are  not  easily  implemented.  This  thesis  will  partially  close  this  gap 
in  the  current  literature  by  investigating  a  system  that  is  subject  to  continuous  state- 
dependent  wear  and  random  shocks,  using  a  stationary,  bivariate  stochastic  process. 
The  ultimate  goal  of  this  research  is  to  contribute  a  tractable  solution  for  the  failure 
time  distribution,  moments,  and  long-run  availability  of  a  single-unit  system  subject 
to this type of failure mechanism.

3. Formal Model Description

The main analytical results of this thesis are derived in this chapter. The first
section  describes  the  mathematical  model  and  notation  used  throughout.  The  failure 
time  distribution  is  derived  as  a  two-dimensional  transform  in  the  second  section. 
That  solution  is  reduced  to  a  one-dimensional  transform  in  the  third  section.  The 
one-dimensional  analytical  solution  is  used  in  section  3.4  to  derive  the  moments  of  the 
failure  time  distribution.  Finally,  this  chapter  concludes  by  investigating  the  long- 
run  availability  of  a  single-unit  system  under  an  inspect-and-replace  maintenance 
policy. 

3.1 Mathematical Model

Consider a single-unit system that incurs damage due to state-dependent, continuous,
linear wear and random shocks. These two damage mechanisms are independent
stochastic processes. The continuous state-dependent wear process and its
unique properties are examined first. A state-dependent wear process depends on
an external environmental stochastic process. This process is intended to model the
normal operating conditions of the system under consideration. It is assumed that
the unit’s normal operating environment has a finite number of states. The elements
of the set $S = \{1, 2, \ldots, K\}$ ($K \ge 2$ is an integer) correspond to the states in
which the system may operate. Without loss of generality, assume the environment
begins in state $i$ at time $t = 0$ and remains in state $i \in S$ for a random amount of
time, called the sojourn time. Then the environment instantaneously transitions to
state $j \in S$, $j \neq i$, and remains there for a random amount of time before transitioning
to another state in $S$. The system continues to evolve randomly over time,
forming a finite state-space stochastic process.

The random variable $Z(t)$ denotes the state of the ambient environmental process
at time $t$, and it assumes values in the state space $S = \{1, 2, \ldots, K\}$. The
sojourn times in state $i$ are independent and identically distributed exponential random
variables with a rate that depends on $i \in S$. Therefore, $\{Z(t) : t \ge 0\}$ is a special stochastic
process, called a finite state-space continuous-time Markov chain (CTMC). Formally,
$\{Z(t) : t \ge 0\}$ characterizes the evolution of the ambient environment for all $t \ge 0$.
Furthermore, the probability of transitioning from state $i \in S$ to state $j \in S$ at time
$t$ is assumed to be time homogeneous and is given by

$$p_{ij}(t) = P\{Z(t) = j \mid Z(0) = i\}.$$

The  Markovian  property  of  a  CTMC  guarantees  that  the  probability  the  system  will 
next  transition  to  state  j  depends  only  on  the  current  state  of  the  system.  Kulkarni 
[24]  provides  a  cogent  discussion  of  CTMCs  and  their  important  properties. 
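
The environment process described above can be simulated directly from its exponential sojourn times. The Python sketch below is one possible illustration, assuming the chain is specified by a generator matrix `Q` (off-diagonal entries are transition rates and each diagonal entry is minus the total exit rate of its row). The matrix representation, the zero-based state labels, and the function name are assumptions introduced here, not notation from this thesis.

```python
import random

def simulate_ctmc(Q, start, horizon, rng):
    """Simulate a finite-state CTMC on [0, horizon].

    Returns a list of (state, sojourn) pairs whose sojourns sum to horizon.
    Assumes every state has a positive exit rate -Q[i][i].
    """
    path, state, remaining = [], start, horizon
    states = range(len(Q))
    while remaining > 0:
        exit_rate = -Q[state][state]
        sojourn = min(rng.expovariate(exit_rate), remaining)  # truncate at horizon
        path.append((state, sojourn))
        remaining -= sojourn
        # The next state is chosen in proportion to the off-diagonal rates.
        weights = [Q[state][j] if j != state else 0.0 for j in states]
        state = rng.choices(states, weights=weights)[0]
    return path

Q = [[-1.0, 1.0], [3.0, -3.0]]
path = simulate_ctmc(Q, 0, 20000.0, random.Random(1))
time_in_0 = sum(d for s, d in path if s == 0)
print(time_in_0 / 20000.0)
```

For this two-state example the long-run fraction of time spent in the first state is $3/(1+3) = 0.75$, so the printed value should be close to that figure.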

The system experiences wear at a linear rate that depends on the state of the
ambient process. Let $R(t)$ be defined as the wear rate of the system at time $t$ and
define a positive function $r : S \to \mathbb{R}^+$, where $\mathbb{R}^+$ is the positive real line. The
wear-rate process $\{R(t) : t \ge 0\}$ depends explicitly on the surrounding environment
and assumes values in the set $V = \{r(1), \ldots, r(K)\}$. When the system is in state $i$
($Z(t) = i$) it experiences wear at rate $r(Z(t)) = r(i)$. Define the $K \times K$ matrix
$R_D$ such that

$$[R_D]_{ij} = \begin{cases} r(i), & i = j \\ 0, & \text{otherwise.} \end{cases}$$

Next, define the random variable $W(t)$ to be the total accumulated wear up to
time $t$, so that

$$W(t) = \int_0^t r(Z(u)) \, du.$$

The process $\{W(t) : t \ge 0\}$ is referred to as a state-dependent wear process
and is illustrated in Figure 3.1. The random variable $T_x$ is the time at which the
magnitude of the accumulated wear exceeds the fixed threshold value $x$. Formally
defined, the lifetime of a system subject to continuous wear only is given by

$$T_x = \inf\{t : W(t) > x\}.$$

State-dependent wear processes have been studied extensively ([12], [6], [26], [19]).
In particular, Kharoufeh [19] derived analytical expressions for the failure time distribution
and moments for such a system.
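
Because the wear rate is constant within each sojourn, the accumulated wear is piecewise linear and the wear-only lifetime can be read off a sample path by linear interpolation. The Python sketch below illustrates this for a path given as a list of (state, sojourn) pairs. The path representation, the zero-based state labels, and the function names are assumptions made for illustration.

```python
def wear_at(path, r, t):
    """Accumulated wear W(t) for a piecewise-constant environment path.

    path is a list of (state, sojourn) pairs; r[state] is the wear rate.
    """
    wear, elapsed = 0.0, 0.0
    for state, sojourn in path:
        step = min(sojourn, t - elapsed)   # portion of this sojourn before t
        if step <= 0:
            break
        wear += r[state] * step
        elapsed += step
    return wear

def wear_failure_time(path, r, x):
    """First time the accumulated wear reaches x, or None if the path is too short."""
    wear, elapsed = 0.0, 0.0
    for state, sojourn in path:
        if wear + r[state] * sojourn >= x:
            # The threshold is crossed inside this sojourn; interpolate linearly.
            return elapsed + (x - wear) / r[state]
        wear += r[state] * sojourn
        elapsed += sojourn
    return None

path = [(0, 2.0), (1, 1.0), (0, 5.0)]
r = [1.0, 4.0]
print(wear_at(path, r, 3.0), wear_failure_time(path, r, 5.0))
```

In this example the wear is 2 after the first sojourn and grows at rate 4 during the second, so a threshold of 5 is reached at time $2 + 3/4 = 2.75$.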


Figure  3.1  Sample  paths  of  cumulative  wear  and  environmental  process. 

This thesis extends the model due to Kharoufeh [19] by superimposing an
independent shock process and examining the failure time distribution of a system
subject to both continuous wear and random shocks. The shocks in the new model
occur according to a Poisson process with intensity $\lambda$, implying that the times between
shocks form a sequence of independent and identically distributed exponential
random variables with rate parameter $\lambda$. The homogeneous Poisson process is denoted
$\{N(t) : t \ge 0\}$, where $N(t)$ is the number of shocks that occur in a time interval
of length $t$. The random amount of damage caused by the $i$th shock is denoted by
$Y_i$. It is assumed that the sequence of random variables $\{Y_i\}$ is independent and
identically distributed with common CDF $F_Y(y) = P\{Y \le y\}$. The matrix
$\tilde{\mathbf{F}}_D(u)$ is defined to be

$$\tilde{\mathbf{F}}_D(u) = \mathrm{diag}\{\tilde{F}_Y(u)\},$$

where

$$\tilde{F}_Y(u) = \int_0^\infty e^{-uy}\,F_Y(dy)$$

denotes the Laplace-Stieltjes transform of $F_Y$ with respect to $y$. The total amount
of damage attributed to random shocks up to time $t$ is the sum of the magnitudes of
the individual shocks that occur prior to time $t$, and is given by the random variable

$$Y(t) = \sum_{i=1}^{N(t)} Y_i,$$

so that $\{Y(t) : t \ge 0\}$ forms a compound Poisson process.

Define a new random variable $X(t)$ as the total damage accrued by the system
up to time $t$. The total cumulative damage up to $t$ is the sum of the damage caused
by the independent wear and shock processes, so that

$$X(t) = W(t) + Y(t), \quad t \ge 0.$$

The bivariate stochastic process $\{(X(t), Z(t)) : t \ge 0\}$ completely characterizes the
state of the system at time $t$. The random variable $T_x$ is defined as the failure time
of a system subject to both wear and random shocks, so that

$$T_x = \inf\{t : X(t) > x\}. \tag{3.1}$$
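
To make the definition in Equation (3.1) concrete, the following is a minimal Monte Carlo sketch that draws a single failure time $T_x$ from the compound damage process. It is an illustration only and is not the simulation model used later in this thesis; the function name `simulate_failure_time`, the choice of exponentially distributed shock magnitudes, and every parameter value in the usage note are assumptions made for the example.

```python
import random


def simulate_failure_time(x, rates, Q, lam, shock_mean, start=0, rng=None):
    """Draw one failure time T_x = inf{t : W(t) + Y(t) > x}.

    rates[i]   : wear rate r(i) > 0 in environment state i
    Q          : generator matrix of the environment CTMC (list of lists)
    lam        : Poisson shock intensity (0 disables shocks)
    shock_mean : mean of the shock size (exponential sizes are an assumption)
    """
    rng = rng or random.Random()
    inf = float("inf")
    t, damage, state = 0.0, 0.0, start
    while True:
        exit_rate = -Q[state][state]
        # Competing exponential clocks; by memorylessness both may be
        # redrawn after every event without changing the process law.
        t_jump = rng.expovariate(exit_rate) if exit_rate > 0 else inf
        t_shock = rng.expovariate(lam) if lam > 0 else inf
        dt = min(t_jump, t_shock)
        # Wear is linear between events, so the crossing time is explicit.
        if damage + rates[state] * dt > x:
            return t + (x - damage) / rates[state]
        t += dt
        damage += rates[state] * dt
        if t_shock < t_jump:
            damage += rng.expovariate(1.0 / shock_mean)
            if damage > x:
                return t  # the shock itself pushed the damage past x
        else:
            # Next state chosen in proportion to the off-diagonal rates.
            weights = [q if j != state else 0.0 for j, q in enumerate(Q[state])]
            state = rng.choices(range(len(rates)), weights=weights)[0]
```

For instance, with a single environment state, wear rate 2, threshold 10 and no shocks, the call `simulate_failure_time(10.0, [2.0], [[0.0]], 0.0, 1.0)` returns the deterministic lifetime 5.0; averaging many draws with shocks enabled gives a rough empirical estimate of the mean failure time.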

A possible sample path of the bivariate process $\{(X(t), Z(t)) : t \ge 0\}$ is illustrated
in Figure 3.2.

Figure 3.2 Sample paths of the compound damage and environmental process.

This process $\{(X(t), Z(t)) : t \ge 0\}$ characterizes the total damage incurred by
the system up to time $t$, as well as the state of the ambient environment at time $t$.
The failure time distribution is shown to satisfy a system of linear, first-order, partial
differential equations. The system is then solved using transform methods, yielding
the Laplace-Stieltjes transform of the unconditional failure time distribution for a
system subject to state-dependent wear and random shocks.

3.2  Failure  Time  Distribution 

In this section the Laplace-Stieltjes transform of the unconditional failure time
distribution $G_x(t) := P\{T_x \le t\}$ is derived. The failure time distribution is derived
by using the relationship between the first time to failure and the amount of
damage sustained up to time $t$. If the total amount of damage has not exceeded the
threshold $x$ at time $t$, then the system must fail after time $t$,

$$\{X(t) \le x\} \Leftrightarrow \{T_x \ge t\}.$$

This relationship implies that the event “a failure occurs at or after time $t$” is
equivalent to the event “the total damage at time $t$ is less than or equal to the
threshold $x$”; therefore,

$$P\{T_x \le t\} = 1 - P\{X(t) \le x\}. \tag{3.2}$$

Hence, one may compute the failure time distribution if the probability distribution
of $X(t)$ is known.

The marginal probability distribution of $X(t)$ is derived by considering the
joint probability distribution $\mathbf{V}(x,t)$, where $\mathbf{V}(x,t) = [V_{i,j}(x,t)]$, $i, j \in S$. The value
$V_{i,j}(x,t)$ is the joint conditional probability that, at time $t$, the degradation of the
system does not exceed the value $x$ and the environmental process is in state $j$, given
that the environmental process began in state $i$,

$$V_{i,j}(x,t) = P\{X(t) \le x,\ Z(t) = j \mid Z(0) = i\}. \tag{3.3}$$

The joint probability distribution $\mathbf{V}(x,t)$ is computed by examining a small
time interval of length $\epsilon$ and conditioning on the state of the system at time $t$. By
doing this, the distribution $\mathbf{V}(x,t)$ is shown to satisfy a system of partial differential
equations, and the system is solved using well-known methods. In order to solve the
system of equations the following transform definitions are needed. First, define the
matrix Laplace transform with respect to $t$ to be

$$\mathbf{V}^*(x,s) = \int_0^\infty e^{-st}\,\mathbf{V}(x,t)\,dt,$$

and the matrix Laplace-Stieltjes transform with respect to $x$ to be

$$\tilde{\mathbf{V}}^*(u,s) = \int_0^\infty e^{-ux}\,\mathbf{V}^*(dx,s).$$

Additionally, the initial probability vector is defined to be $\boldsymbol{\alpha} = [\alpha_i]$, $i \in S$, where
$\alpha_i = P\{Z(0) = i\}$. Using these matrix transforms and Equation (3.2), the first
main result of this thesis, a closed-form, two-dimensional transform expression for
the failure time distribution, is derived.

Theorem 3.1 If the operating environment is a finite, continuous-time Markov
chain with infinitesimal generator matrix $\mathbf{Q}$, and shocks occur according to a Poisson
process with intensity $\lambda$, then the two-dimensional Laplace-Stieltjes transform of the
failure time distribution is given by

$$\tilde{G}_u^*(s) = \frac{1}{s} - \boldsymbol{\alpha}\left((s + \lambda)\mathbf{I} + u\mathbf{R}_D - \lambda\tilde{\mathbf{F}}_D(u) - \mathbf{Q}\right)^{-1}\mathbf{1}, \tag{3.4}$$

where $\mathrm{Re}(u) > 0$, $\mathrm{Re}(s) > 0$ and $\mathbf{1}$ is a $K$-dimensional column vector of ones.

Proof. Let $\epsilon > 0$; then by definition

$$V_{i,j}(x, t+\epsilon) = P\{X(t+\epsilon) \le x,\ Z(t+\epsilon) = j \mid Z(0) = i\}.$$

Conditioning on the state of the ambient environment at time $t$ by allowing $Z(t) = k$,
$k \in S$, and summing over all possible states in $S$, one may write

$$V_{i,j}(x, t+\epsilon) = \sum_{k} P\{X(t+\epsilon) \le x,\ Z(t+\epsilon) = j \mid Z(t) = k,\ Z(0) = i\}\,P\{Z(t) = k\}.$$

Using the relationship between joint and conditional probabilities, the expression
is rewritten as

$$\begin{aligned} V_{i,j}(x,t+\epsilon) = \sum_k\ & P\{Z(t+\epsilon) = j \mid X(t+\epsilon) \le x,\ Z(t) = k,\ Z(0) = i\} \\ & \times P\{X(t+\epsilon) \le x \mid Z(t) = k,\ Z(0) = i\}\,P\{Z(t) = k\}. \end{aligned} \tag{3.5}$$

The environmental process $\{Z(t) : t \ge 0\}$ is independent of the damage process
$\{X(t) : t \ge 0\}$, implying that

$$P\{Z(t+\epsilon) = j \mid X(t+\epsilon) \le x,\ Z(t) = k,\ Z(0) = i\} = P\{Z(t+\epsilon) = j \mid Z(t) = k,\ Z(0) = i\},$$

and Equation (3.5) becomes

$$\begin{aligned} V_{i,j}(x,t+\epsilon) = \sum_k\ & P\{Z(t+\epsilon) = j \mid Z(t) = k,\ Z(0) = i\} \\ & \times P\{X(t+\epsilon) \le x \mid Z(t) = k,\ Z(0) = i\}\,P\{Z(t) = k\}. \end{aligned} \tag{3.6}$$

Recalling that $P\{Z(t+\epsilon) = j \mid Z(t) = k\}$ is denoted $p_{k,j}(\epsilon)$ and that $\{Z(t) : t \ge 0\}$ is a
CTMC, Equation (3.6) may be written as

$$V_{i,j}(x,t+\epsilon) = \sum_k p_{k,j}(\epsilon)\,P\{X(t+\epsilon) \le x \mid Z(t) = k,\ Z(0) = i\}\,P\{Z(t) = k\}.$$

Let the random variable $\Upsilon_k(\epsilon)$ be the total damage sustained during the
interval $(t, t+\epsilon)$ when $Z(t) = k$. The damage due to wear is a linear function of time, so $\Upsilon_k(\epsilon)$
can be expressed as

$$\Upsilon_k(\epsilon) = r(k)\epsilon + Y(\epsilon).$$

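Conditioning on the number of shocks that occur during the interval of length $\epsilon$
yields

$$V_{i,j}(x,t+\epsilon) = \sum_k \sum_{n=0}^{\infty} p_{k,j}(\epsilon)\,P\{X(t) \le x - \Upsilon_k(\epsilon),\ Z(t) = k \mid Z(0) = i,\ N(\epsilon) = n\}\,P\{N(\epsilon) = n\}.$$

Let $\beta_n = \sum_{i=1}^{n} Y_i$; then the expression may be rewritten as

$$\begin{aligned} V_{i,j}(x,t+\epsilon) = \sum_k p_{k,j}(\epsilon)\big(& V_{i,k}(x - r(k)\epsilon,\ t)\,P\{N(\epsilon) = 0\} \\ & + V_{i,k}(x - r(k)\epsilon - \beta_1,\ t)\,P\{N(\epsilon) = 1\} \\ & + V_{i,k}(x - r(k)\epsilon - \beta_2,\ t)\,P\{N(\epsilon) = 2\} \\ & + V_{i,k}(x - r(k)\epsilon - \beta_3,\ t)\,P\{N(\epsilon) = 3\} + \cdots\big). \end{aligned}$$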

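Since $\{N(t) : t \ge 0\}$ is a Poisson process with intensity $\lambda$, it is well known [24]
that

$$\begin{aligned} P\{N(\epsilon) = 0\} &= 1 - \lambda\epsilon + o(\epsilon), \\ P\{N(\epsilon) = 1\} &= \lambda\epsilon + o(\epsilon), \\ P\{N(\epsilon) \ge 2\} &= o(\epsilon), \end{aligned}$$

where $o(\epsilon)/\epsilon \to 0$ as $\epsilon \to 0$. Substituting for $P\{N(\epsilon) = n\}$, $n = 0, 1, 2, \ldots$, and
rearranging terms yields

$$\begin{aligned} V_{i,j}(x,t+\epsilon) = \sum_k p_{k,j}(\epsilon)\big[& V_{i,k}(x - r(k)\epsilon,\ t) - \lambda\epsilon V_{i,k}(x - r(k)\epsilon,\ t) \\ & + \lambda\epsilon V_{i,k}(x - r(k)\epsilon - Y_1,\ t)\big] + o(\epsilon). \end{aligned}$$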

The magnitudes of the shocks form an independent and identically distributed
sequence of random variables; therefore the subscript is subsequently omitted and the
magnitude of the first shock is denoted $Y$. Breaking apart the sum and conditioning
on the magnitude of the shock, $Y$,

$$\begin{aligned} V_{i,j}(x,t+\epsilon) = {} & \sum_k p_{k,j}(\epsilon)\left[V_{i,k}(x - r(k)\epsilon,\ t) - \lambda\epsilon V_{i,k}(x - r(k)\epsilon,\ t)\right] \\ & + \sum_k \int_0^\infty p_{k,j}(\epsilon)\,\lambda\epsilon\,V_{i,k}(x - r(k)\epsilon - y,\ t)\,F_Y(dy) + o(\epsilon). \end{aligned}$$

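First, the case $k = j$ is removed from the sums, leaving

$$\begin{aligned} V_{i,j}(x,t+\epsilon) = {} & p_{j,j}(\epsilon)\left[V_{i,j}(x - r(j)\epsilon,\ t) - \lambda\epsilon V_{i,j}(x - r(j)\epsilon,\ t)\right] \\ & + \sum_{k \ne j} p_{k,j}(\epsilon)\left[V_{i,k}(x - r(k)\epsilon,\ t) - \lambda\epsilon V_{i,k}(x - r(k)\epsilon,\ t)\right] \\ & + \int_0^\infty p_{j,j}(\epsilon)\,\lambda\epsilon\,V_{i,j}(x - r(j)\epsilon - y,\ t)\,F_Y(dy) \\ & + \sum_{k \ne j}\int_0^\infty p_{k,j}(\epsilon)\,\lambda\epsilon\,V_{i,k}(x - r(k)\epsilon - y,\ t)\,F_Y(dy) + o(\epsilon). \end{aligned} \tag{3.7}$$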


Since $\{Z(t) : t \ge 0\}$ is a continuous-time Markov chain, it is well known [24] that

$$p_{k,j}(\epsilon) = \begin{cases} 1 + q_{k,j}\,\epsilon + o(\epsilon), & k = j \\ q_{k,j}\,\epsilon + o(\epsilon), & k \ne j. \end{cases} \tag{3.8}$$


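Using Equation (3.8), substituting, distributing, and incorporating the $k = j$ terms back into
the sums of Equation (3.7), one obtains

$$\begin{aligned} V_{i,j}(x,t+\epsilon) = {} & V_{i,j}(x - r(j)\epsilon,\ t) - \lambda\epsilon V_{i,j}(x - r(j)\epsilon,\ t) \\ & + \epsilon\sum_k q_{k,j}\left(V_{i,k}(x - r(k)\epsilon,\ t) - \lambda\epsilon V_{i,k}(x - r(k)\epsilon,\ t)\right) \\ & + \lambda\epsilon\int_0^\infty V_{i,j}(x - r(j)\epsilon - y,\ t)\,F_Y(dy) \\ & + \lambda\epsilon^2\sum_k\int_0^\infty q_{k,j}V_{i,k}(x - r(k)\epsilon - y,\ t)\,F_Y(dy) + o(\epsilon). \end{aligned}$$

Moving $V_{i,j}(x - r(j)\epsilon, t)$ to the left side, dividing each side by $\epsilon$ and taking the limit as $\epsilon \to 0$ gives

$$\begin{aligned} \lim_{\epsilon \to 0}\frac{V_{i,j}(x,t+\epsilon) - V_{i,j}(x - r(j)\epsilon,\ t)}{\epsilon} = {} & -\lim_{\epsilon \to 0}\lambda V_{i,j}(x - r(j)\epsilon,\ t) \\ & + \lim_{\epsilon \to 0}\sum_k q_{k,j}V_{i,k}(x - r(k)\epsilon,\ t) - \lim_{\epsilon \to 0}\epsilon\lambda\sum_k q_{k,j}V_{i,k}(x - r(k)\epsilon,\ t) \\ & + \lim_{\epsilon \to 0}\lambda\int_0^\infty V_{i,j}(x - r(j)\epsilon - y,\ t)\,F_Y(dy) \\ & + \lim_{\epsilon \to 0}\epsilon\lambda\sum_k\int_0^\infty q_{k,j}V_{i,k}(x - r(k)\epsilon - y,\ t)\,F_Y(dy) \\ & + \lim_{\epsilon \to 0}\frac{o(\epsilon)}{\epsilon}. \end{aligned}$$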


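Applying the definition of partial derivatives, and noting that the terms carrying a
factor of $\epsilon$ vanish in the limit, leaves

$$\frac{\partial V_{i,j}(x,t)}{\partial t} + r(j)\frac{\partial V_{i,j}(x,t)}{\partial x} = -\lambda V_{i,j}(x,t) + \sum_k q_{k,j}V_{i,k}(x,t) + \lim_{\epsilon \to 0}\lambda\int_0^\infty V_{i,j}(x - r(j)\epsilon - y,\ t)\,F_Y(dy). \tag{3.9}$$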


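By Lebesgue’s bounded convergence theorem, the right-most term of Equation (3.9)
may be written (taking $\epsilon = 1/n$) as

$$\lim_{\epsilon \to 0}\lambda\int_0^\infty V_{i,j}(x - r(j)\epsilon - y,\ t)\,F_Y(dy) = \lim_{n \to \infty}\lambda\int_0^\infty V_{i,j}\!\left(x - \frac{r(j)}{n} - y,\ t\right)F_Y(dy),$$

and the limit may be passed inside the integral, which gives

$$\frac{\partial V_{i,j}(x,t)}{\partial t} + r(j)\frac{\partial V_{i,j}(x,t)}{\partial x} = -\lambda V_{i,j}(x,t) + \sum_k q_{k,j}V_{i,k}(x,t) + \lambda\int_0^\infty V_{i,j}(x - y,\ t)\,F_Y(dy). \tag{3.10}$$

Recall that $\mathbf{R}_D = \mathrm{diag}(r(1), \ldots, r(K))$ and write Equation (3.10) in matrix form as

$$\frac{\partial \mathbf{V}(x,t)}{\partial t} + \frac{\partial \mathbf{V}(x,t)}{\partial x}\mathbf{R}_D = -\lambda\mathbf{V}(x,t) + \mathbf{V}(x,t)\mathbf{Q} + \lambda\mathbf{V}(\cdot,t) * F_Y(x), \tag{3.11}$$

where $(*)$ denotes the convolution operator. Next, the Laplace transform, whose
operator is denoted $\mathcal{L}^*$, of Equation (3.11) is taken with respect to $t$. Applying the
differentiation property of the Laplace transform,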


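$$\mathcal{L}^*\left\{\frac{\partial \mathbf{V}(x,t)}{\partial t}\right\} = s\mathbf{V}^*(x,s) - \mathbf{V}(x,0),$$

yields the following equation,

$$s\mathbf{V}^*(x,s) - \mathbf{I} + \frac{\partial \mathbf{V}^*(x,s)}{\partial x}\mathbf{R}_D = -\lambda\mathbf{V}^*(x,s) + \mathbf{V}^*(x,s)\mathbf{Q} + \lambda\mathbf{V}^*(\cdot,s) * F_Y(x). \tag{3.12}$$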


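Next, the Laplace-Stieltjes transform of Equation (3.12) is taken with respect to the
spatial dimension,

$$s\tilde{\mathbf{V}}^*(u,s) - \mathbf{I} + u\tilde{\mathbf{V}}^*(u,s)\mathbf{R}_D = -\lambda\tilde{\mathbf{V}}^*(u,s) + \tilde{\mathbf{V}}^*(u,s)\mathbf{Q} + \lambda\tilde{\mathbf{V}}^*(u,s)\tilde{F}_Y(u), \tag{3.13}$$

where $\tilde{F}_Y(u)$ is the Laplace-Stieltjes transform of the cumulative probability
distribution of the shock’s magnitude,

$$\tilde{F}_Y(u) = \int_0^\infty e^{-ux}\,F_Y(dx).$$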

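Rearranging the terms of Equation (3.13) yields

$$\tilde{\mathbf{V}}^*(u,s)\left(s\mathbf{I} + u\mathbf{R}_D + \lambda\mathbf{I} - \mathbf{Q} - \lambda\tilde{\mathbf{F}}_D(u)\right) = \mathbf{I},$$

where $\tilde{\mathbf{F}}_D(u)$ is a $K \times K$ diagonal matrix with each diagonal element equal to $\tilde{F}_Y(u)$.
Finally, the two-dimensional transform of the joint probability distribution is given
by

$$\tilde{\mathbf{V}}^*(u,s) = \left((s + \lambda)\mathbf{I} + u\mathbf{R}_D - \lambda\tilde{\mathbf{F}}_D(u) - \mathbf{Q}\right)^{-1}, \quad \mathrm{Re}(u) > 0,\ \mathrm{Re}(s) > 0.$$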


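The failure time distribution is obtained as a two-dimensional transform by using
the dual relationship of Equation (3.2) and the initial distribution vector $\boldsymbol{\alpha}$,

$$\begin{aligned} G_x(t) &= 1 - \boldsymbol{\alpha}\mathbf{V}(x,t)\mathbf{1}, \\ G_x^*(s) &= 1/s - \boldsymbol{\alpha}\mathbf{V}^*(x,s)\mathbf{1}, \\ \tilde{G}_u^*(s) &= 1/s - \boldsymbol{\alpha}\tilde{\mathbf{V}}^*(u,s)\mathbf{1} \\ &= 1/s - \boldsymbol{\alpha}\left((s + \lambda)\mathbf{I} + u\mathbf{R}_D - \lambda\tilde{\mathbf{F}}_D(u) - \mathbf{Q}\right)^{-1}\mathbf{1}, \end{aligned}$$

with $\mathrm{Re}(u) > 0$, $\mathrm{Re}(s) > 0$, and $\mathbf{1}$ a $K$-dimensional column vector of ones.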
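
As an illustration of how the transform in Equation (3.4) might be evaluated at a single point $(u, s)$, the sketch below assembles the matrix $(s + \lambda)\mathbf{I} + u\mathbf{R}_D - \lambda\tilde{\mathbf{F}}_D(u) - \mathbf{Q}$ and solves one linear system instead of forming the inverse. It is not part of the derivation: the helper names `solve_linear` and `failure_time_transform_2d`, the plain Gaussian elimination, and the exponential shock-size transform in the usage note are assumptions made for the example.

```python
def solve_linear(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [list(row) + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            m = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= m * A[c][k]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = A[r][n] - sum(A[r][k] * y[k] for k in range(r + 1, n))
        y[r] = acc / A[r][r]
    return y


def failure_time_transform_2d(u, s, alpha, Q, rates, lam, lst_shock):
    """Evaluate 1/s - alpha ((s+lam) I + u R_D - lam F_D(u) - Q)^{-1} 1.

    lst_shock(u) is the Laplace-Stieltjes transform of the shock-size CDF;
    u and s may be real or complex with positive real parts.
    """
    K = len(rates)
    f = lst_shock(u)
    M = [[(s + lam + u * rates[i] - lam * f if i == j else 0.0) - Q[i][j]
          for j in range(K)] for i in range(K)]
    y = solve_linear(M, [1.0] * K)  # y = M^{-1} 1
    return 1.0 / s - sum(a * yi for a, yi in zip(alpha, y))
```

For exponentially distributed shock sizes with unit mean (an assumed distribution), one would pass `lst_shock=lambda u: 1.0 / (1.0 + u)`. With a single state, rate 2 and no shocks, the value at $u = s = 1$ is $1 - 1/3$, which matches the scalar form $1/s - 1/(s + ur)$.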


3.3  Dimensional  Reduction 

The two-dimensional transform solution of Equation (3.4) may be inverted
numerically to approximate the solution in the original temporal and spatial domains.
However, the existing techniques for two-dimensional numerical inversion are more
difficult to implement than one-dimensional inversion algorithms. Furthermore, the
two-dimensional techniques are computationally expensive and can be numerically
unstable. Thus, in this section the two-dimensional transform solution is reduced to a
single transform dimension. Reducing the solution to a single transform variable
greatly simplifies the numerical inversion, making the solution more practical to use.

Kharoufeh and Sipe [20] were able to reduce a two-dimensional transform result
for a state-dependent wear process (excluding shocks) by converting their original
matrix partial differential equation into an ordinary differential equation (ODE), and
solving the ODE using an integrating factor. In the spirit of the approach of [20],
the one-dimensional solution for our model is derived.

Theorem 3.2 If the operating environment is a finite, continuous-time Markov
chain with infinitesimal generator matrix $\mathbf{Q}$, and shocks occur according to a Poisson
process with intensity $\lambda$, then the Laplace-Stieltjes transform of the failure time
distribution with respect to $x$ is

$$\tilde{G}_u(t) = 1 - \boldsymbol{\alpha}\exp\left(\left(\mathbf{Q} + \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I}) - u\mathbf{R}_D\right)t\right)\mathbf{1}, \tag{3.14}$$

with $\mathrm{Re}(u) > 0$.

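Proof. The result is derived by beginning with the matrix partial differential
Equation (3.11),

$$\frac{\partial \mathbf{V}(x,t)}{\partial t} + \frac{\partial \mathbf{V}(x,t)}{\partial x}\mathbf{R}_D = -\lambda\mathbf{V}(x,t) + \mathbf{V}(x,t)\mathbf{Q} + \lambda\mathbf{V}(\cdot,t) * F_Y(x).$$

First, take the Laplace-Stieltjes transform with respect to $x$, where the symbol $\mathcal{L}$
denotes the Laplace-Stieltjes transform operator,

$$\mathcal{L}\left\{\frac{\partial \mathbf{V}(x,t)}{\partial t} + \frac{\partial \mathbf{V}(x,t)}{\partial x}\mathbf{R}_D\right\} = \mathcal{L}\left\{-\lambda\mathbf{V}(x,t) + \mathbf{V}(x,t)\mathbf{Q} + \lambda\mathbf{V}(\cdot,t) * F_Y(x)\right\}. \tag{3.15}$$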

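Using the relationship between the Laplace and Laplace-Stieltjes transforms, namely

$$\mathcal{L}\{f(x)\} = u\,\mathcal{L}^*\{f(x)\}, \tag{3.16}$$

it is clear that Equation (3.15) can be written as

$$\frac{\partial \tilde{\mathbf{V}}(u,t)}{\partial t} + u\,\mathcal{L}^*\left\{\frac{\partial \mathbf{V}(x,t)}{\partial x}\right\}\mathbf{R}_D = -\lambda u\mathbf{V}^*(u,t) + u\mathbf{V}^*(u,t)\mathbf{Q} + \lambda u\mathbf{V}^*(u,t)\tilde{F}_Y(u), \tag{3.17}$$

where $\mathbf{V}^*(u,t)$ here denotes the Laplace transform of $\mathbf{V}(x,t)$ with respect to $x$.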


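The Laplace transform with respect to $x$ of the partial derivative of the function
$\mathbf{V}(x,t)$ is known to be

$$\mathcal{L}^*\left\{\frac{\partial \mathbf{V}(x,t)}{\partial x}\right\} = u\mathbf{V}^*(u,t) - \mathbf{V}(0,t). \tag{3.18}$$

Using the differentiation property of the Laplace transform in Equation (3.18),
Equation (3.17) becomes

$$\frac{\partial \tilde{\mathbf{V}}(u,t)}{\partial t} + u\left(u\mathbf{V}^*(u,t) - \mathbf{V}(0,t)\right)\mathbf{R}_D = -\lambda u\mathbf{V}^*(u,t) + u\mathbf{V}^*(u,t)\mathbf{Q} + \lambda u\mathbf{V}^*(u,t)\tilde{F}_Y(u).$$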


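Rearranging terms, and noting that $\mathbf{V}(0,t) = \mathbf{0}$ for $t > 0$ because the wear rates are
strictly positive, yields

$$\frac{\partial \tilde{\mathbf{V}}(u,t)}{\partial t} + u\mathbf{V}^*(u,t)\left(u\mathbf{R}_D + \lambda\mathbf{I} - \mathbf{Q} - \lambda\tilde{\mathbf{F}}_D(u)\right) = \mathbf{0},$$

where $\tilde{\mathbf{F}}_D(u)$ is defined as before. Using the relation in Equation (3.16),

$$\frac{\partial \tilde{\mathbf{V}}(u,t)}{\partial t} + \tilde{\mathbf{V}}(u,t)\left(u\mathbf{R}_D + \lambda\mathbf{I} - \mathbf{Q} - \lambda\tilde{\mathbf{F}}_D(u)\right) = \mathbf{0}. \tag{3.19}$$

Define $\mu(t)$ as the integrating factor

$$\mu(t) = \exp\left(\int \left(u\mathbf{R}_D + \lambda\mathbf{I} - \mathbf{Q} - \lambda\tilde{\mathbf{F}}_D(u)\right)dt\right) = \exp\left(\left(u\mathbf{R}_D + \lambda\mathbf{I} - \mathbf{Q} - \lambda\tilde{\mathbf{F}}_D(u)\right)t\right). \tag{3.20}$$

For brevity, substitute

$$\mathbf{C} = u\mathbf{R}_D + \lambda\mathbf{I} - \mathbf{Q} - \lambda\tilde{\mathbf{F}}_D(u),$$

and multiply both sides of Equation (3.19) by the integrating factor of Equation
(3.20),

$$\frac{\partial \tilde{\mathbf{V}}(u,t)}{\partial t}\exp(\mathbf{C}t) + \tilde{\mathbf{V}}(u,t)\,\mathbf{C}\exp(\mathbf{C}t) = \mathbf{0},$$

$$\frac{\partial\left(\tilde{\mathbf{V}}(u,t)\exp(\mathbf{C}t)\right)}{\partial t} = \mathbf{0}. \tag{3.21}$$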

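The general solution of the ordinary differential equation is obtained by integrating
both sides of Equation (3.21) with respect to $t$,

$$\tilde{\mathbf{V}}(u,t)\exp(\mathbf{C}t) = \boldsymbol{\psi},$$

where $\boldsymbol{\psi}$ is a matrix of constants of integration. Applying the initial condition
$\tilde{\mathbf{V}}(u,0) = \mathbf{I}$, it is clear that

$$\boldsymbol{\psi} = \mathbf{I}.$$

The specific solution is obtained by substituting back for $\mathbf{C}$ and $\boldsymbol{\psi}$,

$$\tilde{\mathbf{V}}(u,t)\exp\left(\left(u\mathbf{R}_D + \lambda\mathbf{I} - \mathbf{Q} - \lambda\tilde{\mathbf{F}}_D(u)\right)t\right) = \mathbf{I}. \tag{3.22}$$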

Finally, rearranging the terms of Equation (3.22) yields the one-dimensional
Laplace-Stieltjes transform of the joint probability distribution,

$$\tilde{\mathbf{V}}(u,t) = \exp\left(\left(\mathbf{Q} + \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I}) - u\mathbf{R}_D\right)t\right). \tag{3.23}$$

Using Equation (3.2), the Laplace-Stieltjes transform of the unconditional failure
time distribution is

$$\tilde{G}_u(t) = 1 - \boldsymbol{\alpha}\tilde{\mathbf{V}}(u,t)\mathbf{1}. \tag{3.24}$$

Equations (3.23) and (3.24) imply that the Laplace-Stieltjes transform of the failure time
distribution is given by

$$\tilde{G}_u(t) = 1 - \boldsymbol{\alpha}\exp\left(\left(\mathbf{Q} + \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I}) - u\mathbf{R}_D\right)t\right)\mathbf{1}. \tag{3.25}$$

■ 
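
The one-dimensional transform of Theorem 3.2 only requires a matrix exponential, which the following sketch evaluates with a basic scaling-and-squaring Taylor scheme. This is a minimal illustration under stated assumptions rather than a production routine: the names `mat_mul`, `mat_exp` and `failure_time_lst`, the fixed number of series terms, and the shock-size transform passed in by the caller are all choices made for the example, and no numerical inversion is attempted here.

```python
import math


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(M, terms=30):
    """exp(M) by scaling and squaring with a truncated Taylor series."""
    n = len(M)
    norm = max(sum(abs(v) for v in row) for row in M)
    # Halve the matrix until its row-sum norm is at most about 1/2.
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2 ** squarings for v in row] for row in M]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def failure_time_lst(u, t, alpha, Q, rates, lam, lst_shock):
    """Evaluate 1 - alpha exp((Q + lam (F_D(u) - I) - u R_D) t) 1."""
    K = len(rates)
    f = lst_shock(u)
    M = [[(Q[i][j] + (lam * (f - 1.0) - u * rates[i] if i == j else 0.0)) * t
          for j in range(K)] for i in range(K)]
    E = mat_exp(M)
    return 1.0 - sum(alpha[i] * sum(E[i]) for i in range(K))
```

In the single-state case the expression collapses to $1 - \exp\left((\lambda(\tilde{F}_Y(u) - 1) - ur)t\right)$, which gives a convenient scalar check of the routine before it is handed to an inversion algorithm.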

Many methods exist for numerically inverting the one-dimensional
Laplace-Stieltjes transform of Equation (3.14). One possible method would be to
convert it to a Laplace transform using Equation (3.16) and then use the algorithm
of Abate and Whitt [1]. In the next section, the one-dimensional Laplace-Stieltjes
transform of Equation (3.14) is used to derive a closed-form expression for the $n$th
moment of the failure time distribution.

3.4  Computing  Unconditional  Moments 

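In this section an analytical expression for the $n$th moment of the failure time
distribution of a system subject to wear and shocks is derived. The $n$th moment is
computed by taking the Laplace-Stieltjes transform of $\tilde{G}_u(t)$ with respect to $t$ and
evaluating its $n$th derivative with respect to the transform variable $s$ at the value zero. That is,

$$\tilde{m}_n(u) = (-1)^n \left.\frac{\partial^n}{\partial s^n}\left(\mathcal{L}\{\tilde{G}_u(t)\}\right)\right|_{s=0}. \tag{3.26}$$

The Laplace-Stieltjes transform of the failure time distribution is

$$\tilde{G}_u(t) = 1 - \boldsymbol{\alpha}\exp\left(\left(\mathbf{Q} + \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I}) - u\mathbf{R}_D\right)t\right)\mathbf{1}.$$

Define the function $\boldsymbol{\Phi}_u(t)$ to be

$$\boldsymbol{\Phi}_u(t) = \mathbf{I} - \exp\left(\left(\mathbf{Q} + \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I}) - u\mathbf{R}_D\right)t\right)$$

and compute the Laplace-Stieltjes transform of $\boldsymbol{\Phi}_u(t)$ with respect to $t$. Substitute

$$\mathbf{A} = \mathbf{Q} + \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I}) - u\mathbf{R}_D$$

for brevity to obtain

$$\tilde{\boldsymbol{\Phi}}_u(s) = \int_0^\infty e^{-st}\,d\boldsymbol{\Phi}_u(t) = -\mathbf{A}\int_0^\infty e^{-st}\exp(\mathbf{A}t)\,dt. \tag{3.27}$$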


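Using fundamental matrix calculus (cf. Neuts [29]), the integral of Equation (3.27)
is evaluated as

$$\begin{aligned} \tilde{\boldsymbol{\Phi}}_u(s) &= -\mathbf{A}(s\mathbf{I} - \mathbf{A})^{-1}, \\ &= \left(-s\mathbf{I} + (s\mathbf{I} - \mathbf{A})\right)(s\mathbf{I} - \mathbf{A})^{-1}, \\ &= -s\mathbf{I}(s\mathbf{I} - \mathbf{A})^{-1} + (s\mathbf{I} - \mathbf{A})(s\mathbf{I} - \mathbf{A})^{-1}, \\ &= \mathbf{I} - s(s\mathbf{I} - \mathbf{A})^{-1}. \end{aligned} \tag{3.28}$$

In the following lemma a closed-form solution for the $n$th partial derivative of
$\tilde{\boldsymbol{\Phi}}_u(s)$ is provided.

Lemma 3.1 The $n$th partial derivative of the function $\tilde{\boldsymbol{\Phi}}_u(s)$ is

$$\frac{\partial^n}{\partial s^n}\tilde{\boldsymbol{\Phi}}_u(s) = (-1)^n n!\,(s\mathbf{I} - \mathbf{A})^{-n}\left(-s(s\mathbf{I} - \mathbf{A})^{-1} + \mathbf{I}\right). \tag{3.29}$$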

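Proof. The lemma is proven using a mathematical induction argument.
First, Equation (3.29) is shown to hold when $n = 0, 1,$ and $2$. A simple substitution
shows the result holds for $n = 0$. The first partial derivative of $\tilde{\boldsymbol{\Phi}}_u(s)$ is

$$\begin{aligned} \frac{\partial}{\partial s}\tilde{\boldsymbol{\Phi}}_u(s) &= (s\mathbf{I} - \mathbf{A})^{-2}(\mathbf{I})(s\mathbf{I}) - (s\mathbf{I} - \mathbf{A})^{-1}, \\ &= s(s\mathbf{I} - \mathbf{A})^{-2} - (s\mathbf{I} - \mathbf{A})^{-1}, \end{aligned}$$

and the second partial derivative is

$$\begin{aligned} \frac{\partial^2}{\partial s^2}\tilde{\boldsymbol{\Phi}}_u(s) &= -2s(s\mathbf{I} - \mathbf{A})^{-3} + (s\mathbf{I} - \mathbf{A})^{-2} + (s\mathbf{I} - \mathbf{A})^{-2}, \\ &= -2s(s\mathbf{I} - \mathbf{A})^{-3} + 2(s\mathbf{I} - \mathbf{A})^{-2}. \end{aligned}$$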

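For the inductive step, the result is assumed to hold for some integer $k \ge 2$ and is
proven true for $k + 1$. Using the inductive hypothesis,

$$\begin{aligned} \frac{\partial^{k+1}}{\partial s^{k+1}}\tilde{\boldsymbol{\Phi}}_u(s) &= \frac{\partial}{\partial s}\left((-1)^k k!\,(s\mathbf{I} - \mathbf{A})^{-k}\left(-s(s\mathbf{I} - \mathbf{A})^{-1} + \mathbf{I}\right)\right), \\ &= \frac{\partial}{\partial s}\left((-1)^{k+1} k!\,s(s\mathbf{I} - \mathbf{A})^{-(k+1)} + (-1)^k k!\,(s\mathbf{I} - \mathbf{A})^{-k}\right), \\ &= (-1)^{k+1} k!\,(s\mathbf{I} - \mathbf{A})^{-(k+1)} + s(-1)^{k+2}(k+1)!\,(s\mathbf{I} - \mathbf{A})^{-(k+2)} \\ &\quad + k(-1)^{k+1} k!\,(s\mathbf{I} - \mathbf{A})^{-(k+1)}, \\ &= s(-1)^{k+2}(k+1)!\,(s\mathbf{I} - \mathbf{A})^{-(k+2)} + (k+1)(-1)^{k+1} k!\,(s\mathbf{I} - \mathbf{A})^{-(k+1)}, \\ &= s(-1)^{k+2}(k+1)!\,(s\mathbf{I} - \mathbf{A})^{-(k+2)} + (-1)^{k+1}(k+1)!\,(s\mathbf{I} - \mathbf{A})^{-(k+1)}, \\ &= (-1)^{k+1}(k+1)!\,(s\mathbf{I} - \mathbf{A})^{-(k+1)}\left(-s(s\mathbf{I} - \mathbf{A})^{-1} + \mathbf{I}\right). \end{aligned}$$

Lemma 3.2 The $n$th partial derivative of the function $\tilde{\boldsymbol{\Phi}}_u(s)$ evaluated at $s = 0$ is

$$\left.\frac{\partial^n}{\partial s^n}\tilde{\boldsymbol{\Phi}}_u(s)\right|_{s=0} = (-1)^n n!\left(u\mathbf{R}_D - \mathbf{Q} - \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I})\right)^{-n}.$$

The proof follows directly from Equation (3.29) by substituting $s = 0$. Next, a
closed-form analytical expression for the one-dimensional Laplace-Stieltjes transform
of the $n$th moment of the failure time distribution is derived using the results of Lemma
3.1, Lemma 3.2 and Equation (3.26).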
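
A quick numerical sanity check of Lemmas 3.1 and 3.2 can be made in the scalar case, where the matrix $\mathbf{A}$ reduces to a real number $a$. The sketch below compares the closed form of Equation (3.29) with a central finite difference of the previous derivative; the function names, the step size `h`, and the test values of `s` and `a` are assumptions chosen only for illustration.

```python
import math


def phi(s, a):
    """Scalar version of Equation (3.28): 1 - s (s - a)^{-1}."""
    return 1.0 - s / (s - a)


def phi_derivative(n, s, a):
    """Scalar version of the closed form in Equation (3.29)."""
    return ((-1) ** n * math.factorial(n) * (s - a) ** (-n)
            * (1.0 - s / (s - a)))


def central_difference(f, s, h=1e-4):
    """Second-order finite-difference estimate of f'(s)."""
    return (f(s + h) - f(s - h)) / (2.0 * h)


if __name__ == "__main__":
    a, s = -1.5, 0.7
    for n in (1, 2, 3):
        # Differentiate the (n-1)th closed form and compare with the nth.
        approx = central_difference(lambda z: phi_derivative(n - 1, z, a), s)
        print(n, approx, phi_derivative(n, s, a))
```

Setting `s = 0` in `phi_derivative` reproduces the scalar form of Lemma 3.2, namely $(-1)^n n!\,(-a)^{-n}$.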

Theorem 3.3 If the operating environment is a finite, continuous-time Markov
chain with infinitesimal generator matrix $\mathbf{Q}$, and shocks occur according to a Poisson
process with intensity $\lambda$, then the Laplace-Stieltjes transform of the $n$th moment of the
failure time distribution with respect to $x$ is

$$\tilde{m}_n(u) = n!\,\boldsymbol{\alpha}\left(u\mathbf{R}_D - \mathbf{Q} - \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I})\right)^{-n}\mathbf{1}, \tag{3.30}$$

with $\mathrm{Re}(u) > 0$.

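Proof. Using Equation (3.26) it is known that

$$\tilde{m}_n(u) = (-1)^n\,\boldsymbol{\alpha}\left[\frac{\partial^n}{\partial s^n}\tilde{\boldsymbol{\Phi}}_u(s)\right]_{s=0}\mathbf{1}.$$

Using Lemmas 3.1 and 3.2, the final result is obtained after rearranging terms,

$$\begin{aligned} \tilde{m}_n(u) &= (-1)^{2n}\,n!\,\boldsymbol{\alpha}(-\mathbf{A})^{-n}\mathbf{1}, \\ &= n!\,\boldsymbol{\alpha}\left(u\mathbf{R}_D - \mathbf{Q} - \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I})\right)^{-n}\mathbf{1}. \end{aligned}$$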

■ 
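
For a two-state environment the moment transform of Theorem 3.3 can be evaluated with an explicit $2 \times 2$ inverse, as in the sketch below. The restriction to $K = 2$, the function names `inverse_2x2` and `moment_lst`, and the parameter values mentioned afterwards are assumptions made to keep the example short; a general implementation would replace the explicit inverse with a linear solver.

```python
import math


def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix given as a list of two rows."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def moment_lst(n, u, alpha, Q, rates, lam, lst_shock):
    """n! * alpha * (u R_D - Q - lam (F_D(u) - I))^{-n} * 1 for K = 2."""
    f = lst_shock(u)
    M = [[(u * rates[i] - lam * (f - 1.0) if i == j else 0.0) - Q[i][j]
          for j in range(2)] for i in range(2)]
    Minv = inverse_2x2(M)
    vec = [1.0, 1.0]
    for _ in range(n):  # apply the inverse n times to the vector of ones
        vec = [Minv[i][0] * vec[0] + Minv[i][1] * vec[1] for i in range(2)]
    return math.factorial(n) * (alpha[0] * vec[0] + alpha[1] * vec[1])
```

When both states share the same wear rate $r$ and there are no shocks, the lifetime is the constant $x/r$, and the function returns $n!/(ur)^n$, which is the Laplace-Stieltjes transform in $x$ of $(x/r)^n$. This special case is a convenient check that the matrix has been assembled with the correct signs.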

Thus far, only measures of reliability have been addressed. In the next section
an inspect-and-replace maintenance policy is defined, and an analytical expression
for the system’s long-run availability under such a policy is derived using results
from renewal theory.

3.5  Availability  Measures 

The reliability measures derived in the previous sections allow an analyst to
determine the probability that a system will fail at or before some arbitrary time $t$. However,
they do not by themselves provide a tractable way to measure the system’s availability.
Calculating the instantaneous or average availability of the system is of limited use; in most
applications it is sufficient to consider the long-run behavior of the system. The long-run
availability of a system is defined to be the limit of the system’s average availability,
$A(t)$, as the length of the interval tends to infinity,

$$A = \lim_{t \to \infty} A(t) < \infty.$$

In this section the long-run availability of a system subject to state-dependent
wear and random shocks is derived for an inspect-and-replace maintenance policy.
An inspect-and-replace maintenance policy is one in which the system is inspected
at intervals of $\tau$ time units. If upon inspection the system has failed, it is replaced
with a new and identical unit; however, if the system is found to be operative, then
no action is taken. The inspections are assumed to always correctly diagnose the
system’s condition.

To derive the long-run availability under these assumptions one must consider
the stochastic process $\{\Psi(t) : t \ge 0\}$, where

$$\Psi(t) = \begin{cases} 1, & \text{if } X(t) \le x \\ 0, & \text{otherwise.} \end{cases}$$

This regenerative process, often called an up-down machine, has been studied
extensively [2], [24]. A possible sample path of the process is depicted in Figure 3.3.


Figure  3.3  Sample  path  of  an  up-down  machine. 


Define the random variable $T_x^i$ to be the lifetime of the $i$th system, so that the
sequence of random lifetimes $\{T_x^1, T_x^2, \ldots\}$ is independent and identically distributed
with a common mean $E[T_x]$. The Laplace-Stieltjes transform of their distribution
is given by Equation (3.14). Furthermore, let the random variable $T_i$ be the $i$th
inter-replacement time. The long-run availability of this system is derived using a
well-known result from renewal theory, but first, a preparatory lemma, as well as
two minor propositions, must be proved.


Proposition 3.1 The maximum lifetime of a system subject to continuous,
state-dependent, linear wear and a Poisson shock process is

$$\Delta = \frac{x}{\min\{r(i) : i \in S\}}. \tag{3.31}$$


Proof. Suppose first that $N(T_x) = 0$, where $N(T_x)$
denotes the random number of shocks that occur in the interval $[0, T_x]$. Clearly, the
system will operate longer if it accrues damage at the lowest possible rate,
$\min\{r(i) : i \in S\}$. Therefore, the sample path that achieves the longest possible lifetime is the
one for which the system occupies the most benign environment throughout its life.
If the system begins in state $j$, where $\min\{r(i) : i \in S\} = r(j)$, and remains in state
$j$ until failure, then its lifetime is precisely

$$T_x = \frac{x}{r(j)} = \Delta.$$

If the system does experience shocks (i.e., $N(T_x) > 0$), the unit’s lifetime must
be shorter than $\Delta$ because shocks only add damage to the system. ■

The  second  proposition  follows  directly  from  Proposition  3.1. 


Proposition 3.2 For any $\epsilon > 0$,

$$G_x(\Delta + \epsilon) = 1. \tag{3.32}$$

Proof. Proposition 3.1 guarantees that the system will fail at or before
$t = \Delta$; therefore,

$$P\{T_x \le \Delta\} = 1 \ \Rightarrow\ P\{T_x \le \Delta + \epsilon\} = 1. \tag{3.33}$$


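The needed lemma may now be proven using Propositions 3.1 and 3.2.

Lemma 3.3 The infinite series $\sum_{k=0}^{\infty} -G_x(k\tau)$ is convergent with sum

$$\sum_{k=0}^{\infty} -G_x(k\tau) = \gamma - \sum_{k=0}^{\gamma-1} G_x(k\tau), \tag{3.34}$$

where

$$\gamma = \min\{n \ge 1 : n\tau \ge \Delta\}.$$

Proof. Consider the infinite series

$$\sum_{k=0}^{\infty} -G_x(k\tau) = \int_0^{\tau} dG_x(t) + 2\int_{\tau}^{2\tau} dG_x(t) + 3\int_{2\tau}^{3\tau} dG_x(t) + \cdots, \tag{3.35}$$

where the left side is understood as shorthand for the series on the right, grouped by
the increments of $G_x$ (summed term by term, the values $-G_x(k\tau)$ alone would not converge).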

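The right side of Equation (3.35) may be rewritten in the following way:

$$\begin{aligned} \sum_{k=0}^{\infty} -G_x(k\tau) = {} & [G_x(\tau) - G_x(0)] + 2[G_x(2\tau) - G_x(\tau)] + \cdots + \gamma[G_x(\gamma\tau) - G_x((\gamma - 1)\tau)] \\ & + (\gamma + 1)[G_x((\gamma + 1)\tau) - G_x(\gamma\tau)] + \cdots, \end{aligned} \tag{3.36}$$

where

$$\gamma = \min\{n \ge 1 : n\tau \ge \Delta\}. \tag{3.37}$$

Equations (3.32) and (3.37) imply that for all $t \ge \gamma\tau$,

$$G_x(t) = 1. \tag{3.38}$$

Equations (3.36) and (3.38) imply

$$\begin{aligned} \sum_{k=0}^{\infty} -G_x(k\tau) = {} & [G_x(\tau) - G_x(0)] + 2[G_x(2\tau) - G_x(\tau)] + \cdots + \gamma[1 - G_x((\gamma - 1)\tau)] \\ & + (\gamma + 1)[1 - 1] + \cdots. \end{aligned} \tag{3.39}$$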


Rearranging the terms of Equation (3.39) yields

$$\begin{aligned} \sum_{k=0}^{\infty} -G_x(k\tau) &= -G_x(0) - G_x(\tau) - G_x(2\tau) - \cdots - G_x((\gamma - 1)\tau) + \gamma, \\ &= \gamma - \sum_{k=0}^{\gamma-1} G_x(k\tau). \end{aligned} \tag{3.40}$$

The main result of this section is now presented.


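Theorem 3.4 If the operating environment is a finite, continuous-time Markov
chain with infinitesimal generator matrix $\mathbf{Q}$, and shocks occur according to a Poisson
process with intensity $\lambda$, then the long-run availability under an inspect-and-replace
maintenance policy where an inspection occurs every $\tau$ time units is

$$A = \frac{\mathcal{L}^{-1}\left\{\boldsymbol{\alpha}\left(u\mathbf{R}_D - \mathbf{Q} - \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I})\right)^{-1}\mathbf{1}\right\}}{\tau\left(\gamma - \sum_{k=0}^{\gamma-1}\mathcal{L}^{-1}\{\tilde{G}_u(k\tau)\}\right)}, \tag{3.41}$$

where $\tilde{G}_u(k\tau)$ is the Laplace-Stieltjes transform of Equation (3.14) and $\mathcal{L}^{-1}$ denotes
the inverse Laplace-Stieltjes operator.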

Proof. 

The long-run availability of an up-down regenerative process is known to be
(cf. [2])

$$A = \lim_{t \to \infty} A(t) = \frac{E[T_x]}{E[T]}, \tag{3.42}$$

where $E[T_1] = E[T_2] = \cdots = E[T]$.

The Laplace-Stieltjes transform of the mean lifetime for a system subject to
state-dependent wear and a Poisson shock process is computed by evaluating
Equation (3.30) at $n = 1$; therefore

$$E[T_x] = \mathcal{L}^{-1}\left\{\boldsymbol{\alpha}\left(u\mathbf{R}_D - \mathbf{Q} - \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I})\right)^{-1}\mathbf{1}\right\}. \tag{3.43}$$

An analytical expression for $E[T]$ is next derived:

$$E[T] = E\big[E[T \mid T_x]\big] = \int_0^\infty E[T \mid T_x = t]\,dG_x(t), \tag{3.44}$$


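where $G_x(t) = P\{T_x \le t\}$. Because a failure will always be found on the first
inspection after it occurs, the conditional expectation is given by

$$E[T \mid T_x = t] = \begin{cases} \tau, & 0 \le t \le \tau \\ 2\tau, & \tau < t \le 2\tau \\ \ \vdots & \\ k\tau, & (k - 1)\tau < t \le k\tau \\ \ \vdots & \end{cases} \tag{3.45}$$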


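Using Equations (3.44) and (3.45), one may write

$$\begin{aligned} E[T] &= \tau\int_0^{\tau} dG_x(t) + 2\tau\int_{\tau}^{2\tau} dG_x(t) + 3\tau\int_{2\tau}^{3\tau} dG_x(t) + \cdots, \\ &= \tau[G_x(\tau) - G_x(0)] + 2\tau[G_x(2\tau) - G_x(\tau)] + 3\tau[G_x(3\tau) - G_x(2\tau)] + \cdots, \\ &= -\tau\left(G_x(\tau) + G_x(2\tau) + G_x(3\tau) + \cdots\right), \\ &= -\tau\sum_{k=0}^{\infty} G_x(k\tau). \end{aligned} \tag{3.46}$$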


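The infinite series of Equation (3.46) is known to converge by Lemma 3.3; therefore

$$E[T] = \tau\left(\gamma - \sum_{k=0}^{\gamma-1} G_x(k\tau)\right).$$

The Laplace-Stieltjes transform of the failure time distribution is known; therefore
$E[T]$ may be computed by evaluating the inverse Laplace-Stieltjes transform

$$G_x(k\tau) = \mathcal{L}^{-1}\{\tilde{G}_u(k\tau)\}.$$

Substituting the appropriate quantities into Equation (3.42), the final result is
obtained,

$$A = \frac{\mathcal{L}^{-1}\left\{\boldsymbol{\alpha}\left(u\mathbf{R}_D - \mathbf{Q} - \lambda(\tilde{\mathbf{F}}_D(u) - \mathbf{I})\right)^{-1}\mathbf{1}\right\}}{\tau\left(\gamma - \sum_{k=0}^{\gamma-1}\mathcal{L}^{-1}\{\tilde{G}_u(k\tau)\}\right)}. \tag{3.47}$$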


The transform expressions that appear in the long-run availability derived in this
section are easily inverted numerically using the algorithm of [1]. Moreover, this result
can be used to quickly compare the availability of a system under competing
inspection policies, ultimately finding an optimal inter-inspection time that will maximize
the long-run availability.
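
Once the mean lifetime and the values $G_x(k\tau)$ are available (for instance from a numerical inversion, which is not performed here), the availability of Theorem 3.4 reduces to a short finite sum. The sketch below illustrates that final step only; the function name `long_run_availability` and its arguments are hypothetical, and the failure time CDF is supplied by the caller as an ordinary Python function.

```python
import math


def long_run_availability(mean_lifetime, cdf, tau, max_lifetime):
    """E[T_x] / (tau * (gamma - sum_{k < gamma} G_x(k * tau))).

    mean_lifetime : E[T_x]
    cdf           : callable returning G_x(t) = P{T_x <= t}
    tau           : time between inspections
    max_lifetime  : x / min r(i), the bound of Proposition 3.1
    """
    # Smallest n >= 1 with n * tau >= max_lifetime. Rounding this up by one
    # through floating-point error is harmless because the extra term has
    # G_x = 1 and cancels against the larger gamma.
    gamma = max(1, math.ceil(max_lifetime / tau))
    mean_cycle = tau * (gamma - sum(cdf(k * tau) for k in range(gamma)))
    return mean_lifetime / mean_cycle
```

As a usage example under assumed values, a unit whose lifetime is exactly 2.5 time units and which is inspected every 1 time unit is replaced at time 3, giving an availability of 2.5/3; evaluating the function over a grid of `tau` values is one simple way to compare competing inspection intervals.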

3.6 Summary

This chapter has provided results for the reliability and long-run availability
of a system subject to continuous, state-dependent, linear wear and random shocks.
Specifically, the failure time distribution of a system subject to this compound
damage process was first derived as a two-dimensional Laplace-Stieltjes transform and
then reduced to a single dimension. Next, the moments of that distribution were
derived using the one-dimensional result of section 3.3. Finally, an expression for the
Laplace-Stieltjes transform of the long-run availability under an inspect-and-replace
maintenance policy was derived. The utility of these results will be demonstrated in
chapter 4.

4. Numerical Results


In this chapter the main results of chapter 3 will be illustrated on five example
problems. For each problem, a numerical approximation of the analytical failure
time distribution is computed, as are the mean and variance of the failure time
distribution. Finally, the long-run availability is computed for each example using
an inspect-and-replace maintenance policy. Throughout this chapter, the analytical
transform results are inverted numerically using the algorithm of Abate and Whitt
[1], and compared to the same measures obtained by simulating the system. The
simulation model used in this chapter is explained in the next section.

4.1 Simulation

For each problem, a computer simulation of the compound damage process was
created using Arena®, a high-level commercial simulation package. The system is
simulated in the following manner: Two entities are created at time $t = 0$. The
first entity travels through the environmental process, transitioning from state to
state according to the appropriate CTMC. The time spent in each state is recorded,
and the wear accumulated for each sojourn time is calculated. After each transition,
the total wear is calculated. Meanwhile, the second entity emulates the Poisson
shock process. Each time a simulated shock occurs, the magnitude is recorded and
the total damage incurred by the shocks is computed. Using the random variables
$W(t)$ and $Y(t)$ the simulation calculates the total damage incurred by the system. A
simple comparison logic is used to determine if $X(t)$ has exceeded the system’s failure
threshold value $x$. If the threshold has been breached, then the time the threshold
was exceeded is determined; otherwise, the simulation continues.

When the simulated system fails, the failure time is recorded and the replication
is terminated. The real system, without maintenance, is assumed to be replaced at
failure with a new and identical unit. Therefore, the next replication reinitializes the
system and runs in the same manner. Using this technique, 100,000 independent
replications were conducted. A Student's t-test determined that 100,000 replications
produce a sufficiently small confidence interval about the true reliability measures.
Finally, the associated failure times were used to compute an empirical distribution
function and other reliability measures.
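The replication logic described above can be sketched in a few lines of Python. This is only an illustrative stand-in for the Arena model, not the model itself: the function name `simulate_failure_time`, its arguments, and the use of competing exponential clocks (redrawn after every event, which memorylessness permits) are assumptions made for this sketch. The usage lines plug in the battery parameters of the first example in this chapter.

```python
import math
import random

def simulate_failure_time(Q, r, lam, draw_shock, x, alpha, rng):
    """Simulate one first-passage time of X(t) = W(t) + Y(t) above level x."""
    states = range(len(r))
    state = rng.choices(states, weights=alpha)[0]
    t, damage = 0.0, 0.0
    while True:
        exit_rate = -Q[state][state]
        # Competing exponential clocks: next environment jump vs. next shock.
        t_jump = rng.expovariate(exit_rate) if exit_rate > 0 else math.inf
        t_shock = rng.expovariate(lam) if lam > 0 else math.inf
        dt = min(t_jump, t_shock)
        # Wear is linear within a sojourn, so the crossing time is exact.
        if r[state] > 0 and damage + r[state] * dt >= x:
            return t + (x - damage) / r[state]
        if math.isinf(dt):
            return math.inf  # no wear, no jumps, no shocks: never fails
        t += dt
        damage += r[state] * dt
        if t_shock < t_jump:
            damage += draw_shock(rng)  # a shock adds a random jump in damage
            if damage >= x:
                return t
        else:
            # Move to state j with probability q_ij / (-q_ii).
            weights = [q if j != state else 0.0
                       for j, q in enumerate(Q[state])]
            state = rng.choices(states, weights=weights)[0]

# Usage with the battery example: two states, exponential shocks of mean 0.25.
rng = random.Random(1)
Q = [[-25 / 3, 25 / 3], [25 / 3, -25 / 3]]
times = [simulate_failure_time(Q, [65 / 60, 0.25], 0.5,
                               lambda g: g.expovariate(4.0),
                               1.0, [1.0, 0.0], rng)
         for _ in range(5000)]
mttf = sum(times) / len(times)
print(f"estimated MTTF from {len(times)} replications: {mttf:.3f}")
```

With only 5,000 replications the estimate is noisier than the 100,000-replication figures reported in this chapter, but it should land close to them.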


4.2 Lifetime of an Electric Battery

The first example considers the lifetime of an electric battery used in a diagnostic
test set. The process begins when a fully charged battery is placed into service
at time t = 0. The test set is considered to be operating when a piece of equipment
is tested, and idle when no test is being performed. For this process, the test set
may be in two distinct states: operating or idle. Hence, the state space is coded as

\[ S = \{\text{operating}, \text{idle}\} = \{1, 2\}. \]

When the test set is operating, it requires power at a rate of r(1) = 65/60 units/hour,
and when it is idle it requires power at a rate of r(2) = 1/4 units/hour. The battery
continuously supplies power at a linear rate determined by the state of the diagnostic
test set. Furthermore, the time to complete a diagnostic test is an exponential
random variable with a rate parameter of 25/3 tests/hour, as is the time between
tests. Define the random variable Z(t) to be the state of the test set at time t;
then the stochastic process {Z(t) : t ≥ 0} can be characterized by a continuous-time
Markov chain (CTMC) whose infinitesimal generator matrix is

\[ Q = \begin{bmatrix} -25/3 & 25/3 \\ 25/3 & -25/3 \end{bmatrix}. \]


Additionally, the battery must supply power to a second piece of diagnostic
equipment. The second piece of equipment is independent of the test set
and requires instantaneous power. The power demands of this piece of equipment
can be characterized as a random shock process. The demands occur according to a Poisson
process with rate 0.5 demands/hour. The magnitude of each power demand is an
exponentially distributed random variable with a mean of 0.25 units.

Define the random variable X(t) to be the total power consumption up to time
t, and assume the battery has a finite power capacity of 1 unit. Finally, assume
the test set begins in state 1 (Z(0) = 1) with probability 1. Using the bivariate
stochastic process {(X(t), Z(t)) : t ≥ 0} and the results of chapter 3, the reliability
of the battery may be computed. The necessary matrices are

\[ R_D = \begin{bmatrix} 65/60 & 0 \\ 0 & 1/4 \end{bmatrix}, \qquad
   F_D(u) = \begin{bmatrix} \frac{4}{4+u} & 0 \\ 0 & \frac{4}{4+u} \end{bmatrix}, \]

and

\[ \alpha = \begin{bmatrix} 1 & 0 \end{bmatrix}. \]

The battery will continue to provide power until the total power required by the
two pieces of equipment exceeds the total capacity of the battery. Equation (3.14)
is used to calculate the Laplace-Stieltjes transform of the failure time distribution,

\[ G_u(t) = 1 - \begin{bmatrix} 1 & 0 \end{bmatrix}
   \exp\left\{ \left(
   \begin{bmatrix} -25/3 & 25/3 \\ 25/3 & -25/3 \end{bmatrix}
   + 0.5 \left( \begin{bmatrix} \frac{4}{4+u} & 0 \\ 0 & \frac{4}{4+u} \end{bmatrix}
   - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)
   - u \begin{bmatrix} 65/60 & 0 \\ 0 & 1/4 \end{bmatrix}
   \right) t \right\}
   \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \tag{4.1} \]

The Laplace-Stieltjes transform of Equation (4.1) is converted to a Laplace transform
and then approximated at various values of t using the algorithm of [1]. The analytical
distribution is shown in Table 4.1, together with the empirical distribution from the
Monte Carlo simulation. The maximum absolute deviation (MAD) in probability is
0.001492. The quality of the approximation is also depicted in Figure 4.1, where the
empirical CDF and the approximated analytical CDF are both plotted against time.


Table 4.1 Analytical versus empirical CDFs for a battery lifetime.

  t    Analytical   Simulated   Deviation   |   t    Analytical   Simulated   Deviation
 0.0    0.000000    0.000000    0.000000    |  1.6    0.821579    0.822560    0.000981
 0.1    0.001390    0.001180    0.000210    |  1.7    0.891184    0.891040    0.000144
 0.2    0.003916    0.003670    0.000246    |  1.8    0.937873    0.936730    0.001143
 0.3    0.008151    0.007740    0.000411    |  1.9    0.966733    0.965800    0.000933
 0.4    0.014840    0.013970    0.000870    |  2.0    0.983266    0.982680    0.000586
 0.5    0.024439    0.023700    0.000739    |  2.1    0.992083    0.991840    0.000243
 0.6    0.037958    0.036820    0.001138    |  2.2    0.996475    0.996420    0.000055
 0.7    0.057100    0.056500    0.000600    |  2.3    0.998522    0.998410    0.000112
 0.8    0.083479    0.083160    0.000319    |  2.4    0.999417    0.999260    0.000157
 0.9    0.118717    0.117990    0.000727    |  2.5    0.999783    0.999700    0.000083
 1.0    0.169900    0.170260    0.000360    |  2.6    0.999924    0.999870    0.000054
 1.1    0.247602    0.248730    0.001128    |  2.7    0.999975    0.999930    0.000045
 1.2    0.354030    0.353530    0.000500    |  2.8    0.999992    0.999990    0.000002
 1.3    0.480028    0.479350    0.000678    |  2.9    0.999998    0.999990    0.000008
 1.4    0.609480    0.608730    0.000750    |  3.0    0.999999    1.000000    0.000001
 1.5    0.726675    0.727070    0.000395    |  3.1    1.000000    1.000000    0.000000

Figure 4.1 Cumulative distribution function of T_x for a battery lifetime.


Next, the analytical mean and variance are compared with those of the simulated
data. The mean time to failure (MTTF) of the 100,000 simulated failure times
is

\[ E[\hat{T}_x] = \frac{1}{100000} \sum_{i=1}^{100000} \hat{T}_x^{\,i} = 1.298323, \]

where \hat{T}_x^{\,i} is the simulated lifetime of the ith replication. The variance of the failure
times, Var(\hat{T}_x), is

\[ \mathrm{Var}(\hat{T}_x) = \frac{1}{100000} \sum_{i=1}^{100000}
   \left( \hat{T}_x^{\,i} - E[\hat{T}_x] \right)^2 = 0.121089. \]
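To give a feel for the precision of 100,000 replications, the following sketch computes a confidence-interval half-width for the simulated MTTF from the two sample statistics above. The 95% level and the normal quantile 1.96 (used here in place of the Student's t quantile with 99,999 degrees of freedom, which is practically identical) are assumptions of this illustration; the text does not state the level that was used.

```python
import math

n = 100_000          # replications reported in the text
mean = 1.298323      # simulated MTTF
variance = 0.121089  # simulated variance of the failure times
z = 1.96             # assumed 95% level (normal approximation to Student's t)

half_width = z * math.sqrt(variance / n)
interval = (mean - half_width, mean + half_width)
print(f"95% CI for the MTTF: {interval[0]:.6f} to {interval[1]:.6f}")
```

The half-width is roughly 0.002 hours, and the analytical MTTF of 1.297626 reported in Table 4.2 falls inside this interval.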


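The same measures are now obtained by numerically inverting the analytical
result of Equation (3.30),

\[ m_n(u) = n! \begin{bmatrix} 1 & 0 \end{bmatrix}
   \left(
   u \begin{bmatrix} 65/60 & 0 \\ 0 & 1/4 \end{bmatrix}
   - \begin{bmatrix} -25/3 & 25/3 \\ 25/3 & -25/3 \end{bmatrix}
   - 0.5 \left( \begin{bmatrix} \frac{4}{4+u} & 0 \\ 0 & \frac{4}{4+u} \end{bmatrix}
   - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)
   \right)^{-n}
   \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \tag{4.2} \]

Again the algorithm of [1] readily inverts Equation (4.2), and the results appear in
Table 4.2.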
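The inversion step can be illustrated with a short, self-contained sketch. It assumes that the algorithm of [1] is the Euler summation method commonly attributed to Abate and Whitt, with the customary parameter choices A = 18.4, n = 15 and m = 11; the text does not confirm either point. It also assumes that the first moment has Laplace-Stieltjes transform α(uR_D − Q − λ(F_D(u) − I))^{-1}1 in the threshold variable x, so that dividing by u gives an ordinary Laplace transform. The names `euler_invert` and `battery_mean_transform` are hypothetical.

```python
import math

def euler_invert(F, x, A=18.4, n=15, m=11):
    """Invert a Laplace transform F at x > 0 by Euler summation (sketch)."""
    def partial_sum(N):
        total = 0.5 * F(complex(A / (2 * x), 0.0)).real
        for k in range(1, N + 1):
            total += (-1) ** k * F(complex(A / (2 * x), k * math.pi / x)).real
        return math.exp(A / 2) / x * total
    # Binomial averaging of successive partial sums accelerates convergence.
    return sum(math.comb(m, k) * 2.0 ** (-m) * partial_sum(n + k)
               for k in range(m + 1))

def battery_mean_transform(u):
    """Laplace transform in x of E[T_x] for the battery example."""
    f = 4.0 / (4.0 + u)                       # LST of the exponential shock size
    shock = 0.5 * (f - 1.0)                   # diagonal entry of lambda (F_D(u) - I)
    a = u * 65.0 / 60.0 + 25.0 / 3.0 - shock  # M[0][0], M = u R_D - Q - lambda (F_D - I)
    d = u * 0.25 + 25.0 / 3.0 - shock         # M[1][1]
    b = c = -25.0 / 3.0                       # off-diagonal entries of M
    det = a * d - b * c
    # alpha = [1 0] selects the first row of M^{-1}; the ones vector sums it.
    return (d - b) / det / u

mttf = euler_invert(battery_mean_transform, 1.0)  # threshold x = 1 unit
print(f"inverted MTTF: {mttf:.4f}")
```

The closed-form 2 × 2 inverse keeps the sketch free of external libraries; for larger state spaces a linear-algebra package would be the natural choice.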


Finally, assume battery failures are hidden and that the equipment completes
a self-diagnostic test every τ = 0.1 hours. If the battery has failed, then it is
instantaneously replaced. If the inspection finds that the battery has not failed, then no action
is taken. The long-run availability of the battery is examined under this
inspect-and-replace maintenance policy. Equation (3.31) defines the battery's maximal life
to be

\[ \frac{x}{\min\{r(i) : i \in S\}} = \frac{1}{0.25} = 4.0 \text{ hours}. \tag{4.3} \]

Equations (3.37) and (4.3) imply that γ = 40.0; therefore, using Equation (3.41), the
long-run availability is given by

\[ A = \frac{\alpha \left( u R_D - Q - \lambda (F_D(u) - I) \right)^{-1} \mathbf{1}}
            {0.1 \left( 40.0 - \sum_{k=0}^{39} G_x(0.1k) \right)}. \tag{4.4} \]

The first moment (the numerator) was computed using Equation (4.2). The
Laplace-Stieltjes transform of the cumulative distribution function was obtained by Equation
(4.1). The numerator and denominator of (4.4) were both inverted numerically using
the algorithm of [1]. The numerical approximation of the long-run availability is

\[ A = 0.962948. \]
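As a rough cross-check of this value, the availability can be recomputed from numbers already tabulated in this section. The sketch assumes that the expected time at which an inspection discovers a failure is τ times the sum of 1 − G_x(kτ) over the inspection epochs k = 0, ..., γ − 1, and that the CDF equals 1 beyond t = 3.1 (the tabulated values agree with this to six decimals). The variable names are illustrative only.

```python
tau = 0.1        # inspection interval (hours)
gamma = 40       # inspections within the maximal life of 4.0 hours
mttf = 1.297626  # analytical MTTF from Table 4.2

# Analytical CDF values G_x(0.1 k), k = 0, ..., 31, copied from Table 4.1.
cdf = [
    0.000000, 0.001390, 0.003916, 0.008151, 0.014840, 0.024439, 0.037958,
    0.057100, 0.083479, 0.118717, 0.169900, 0.247602, 0.354030, 0.480028,
    0.609480, 0.726675, 0.821579, 0.891184, 0.937873, 0.966733, 0.983266,
    0.992083, 0.996475, 0.998522, 0.999417, 0.999783, 0.999924, 0.999975,
    0.999992, 0.999998, 0.999999, 1.000000,
]
# Assumption: the CDF is 1 for the remaining inspection epochs up to gamma - 1.
cdf += [1.0] * (gamma - len(cdf))

# tau * sum_k (1 - G_x(k tau)) is the mean time until a failure is detected.
expected_detection_time = tau * (gamma - sum(cdf))
availability = mttf / expected_detection_time
print(f"E[T] = {expected_detection_time:.4f}, A = {availability:.4f}")
```

Because the table carries only six decimals, agreement with the reported availability is expected to about four decimal places, not exactly.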


Recall that the long-run availability is

\[ A = \frac{E[T_x]}{E[T]}, \tag{4.5} \]

where E[T] is the expected time at which an inspection will discover a failure. The mean
failure and replacement times for the 100,000 replications are 1.298323318 and 1.348338,
respectively. Using these values and Equation (4.5), the simulated long-run
availability is

\[ A = \frac{1.298323318}{1.348338} = 0.962906. \]

The analytical and simulated measures of reliability are summarized in Table 4.2.


Table 4.2 Comparison of reliability measures for a battery.

Measure        Analytical   Simulation   Deviation
MTTF            1.297626     1.298323     0.000698
Variance        0.121231     0.121089     0.000142
Availability    0.962948     0.962906     0.000042

4.3 Fuel Consumption Model

The second example considers the fuel consumption of an F-16 fighter jet. The
F-16 Fighting Falcon is a compact, multirole fighter designed to engage enemy
aircraft in air-to-air combat and to attack ground targets. The rate at which the jet
consumes fuel depends on its flight profile. The high speeds and constant
maneuvering of aerial combat require more fuel than when the F-16 is executing a ground
attack mission. Use of the engine's afterburner also causes fuel to be expended. The
aircraft can carry fuel internally and in external drop tanks. In this example, the
results of this thesis are used to compute the probability that the aircraft will use
all of its external fuel before an arbitrary time t.

Define the random variable Z(t) to be the flight environment of the aircraft at
time t, so that the state space is coded as

\[ S = \{\text{air-to-ground attack}, \text{air-to-air combat}, \text{normal flight}\} = \{1, 2, 3\}. \]

When the aircraft operates in the normal flight environment it uses fuel at a linear
rate of r(3) = 1.0 pound/minute. When the F-16 is functioning in a ground attack
role the engine consumes fuel at a rate of r(1) = 3.0 pounds/minute. Finally, the
extreme conditions of air-to-air combat require the engine to burn fuel at a rate
of r(2) = 7.0 pounds/minute. Furthermore, the time the aircraft spends in flight
environment i ∈ S is an exponential random variable with intensity μ_i. Therefore, the
stochastic process {Z(t) : t ≥ 0} can be characterized by a continuous-time Markov
chain (CTMC) whose infinitesimal generator matrix is

\[ Q = \begin{bmatrix}
   -10.0 & 5.0 & 5.0 \\
   2.0 & -5.0 & 3.0 \\
   0.4 & 0.6 & -1.0
   \end{bmatrix}. \]

Additionally, the pilot will use the afterburner at random intervals, requiring
fuel to be dumped into the engine instantaneously. The afterburner can be used in
any flight environment and is assumed independent of the flight environment. The
use of the afterburner can be thought of as a Poisson shock process with rate 0.01
uses/minute. The magnitude of the fuel needed is a gamma random variable with
parameters β = 1 and k = 3, such that the cumulative distribution function is

\[ F(y) = 1 - \sum_{r=0}^{k-1} \frac{(\beta y)^r e^{-\beta y}}{r!}
        = 1 - \sum_{r=0}^{2} \frac{y^r e^{-y}}{r!}, \qquad y \ge 0. \]
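For readers who want to evaluate the shock-magnitude distribution directly, the finite-sum form of the gamma CDF with integer shape can be coded as follows. The function name `erlang_cdf` is hypothetical, and β is treated as a rate parameter here, which is an assumption; with β = 1 the rate and scale conventions coincide.

```python
import math

def erlang_cdf(y, k=3, beta=1.0):
    """CDF of a gamma variable with integer shape k and rate beta (assumed)."""
    if y <= 0:
        return 0.0
    # 1 minus the probability of fewer than k Poisson(beta * y) events.
    return 1.0 - sum((beta * y) ** r * math.exp(-beta * y) / math.factorial(r)
                     for r in range(k))

# Probability that one afterburner use needs at most 3 pounds of fuel.
print(f"F(3) = {erlang_cdf(3.0):.6f}")
```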


The total amount of fuel consumed up to time t is denoted by the random
variable X(t). Finally, assume the capacity of the F-16's external fuel tanks is 100
pounds. It is assumed the aircraft has an equal chance of beginning flight in any
of the three modes. The random variable T_100 denotes the time at which the aircraft
must begin using its internal fuel reserves. The probability that the external fuel
tanks are insufficient for a sortie of duration t is G_100(t) = P{T_100 ≤ t}. The function
G_100(t) can be considered the failure time distribution, and the required matrices are

\[ R_D = \begin{bmatrix}
   3.0 & 0.0 & 0.0 \\
   0.0 & 7.0 & 0.0 \\
   0.0 & 0.0 & 1.0
   \end{bmatrix}, \qquad
   F_D(u) = \begin{bmatrix}
   \left(\frac{1.0}{1.0+u}\right)^3 & 0.0 & 0.0 \\
   0.0 & \left(\frac{1.0}{1.0+u}\right)^3 & 0.0 \\
   0.0 & 0.0 & \left(\frac{1.0}{1.0+u}\right)^3
   \end{bmatrix}, \]

and

\[ \alpha = \begin{bmatrix} 1/3 & 1/3 & 1/3 \end{bmatrix}. \]

Using these matrices and Equation (3.14), the Laplace-Stieltjes transform of the
failure time distribution was obtained. The approximation of the analytical distribution
is shown in Table 4.3, as is the empirical distribution generated by the simulation.
The two distributions are also depicted in Figure 4.2.

Next, the numerical approximations of the analytical mean and variance are
compared with those of the simulated data. The mean of the 100,000 simulated
failure times is

\[ E[\hat{T}_x] = 47.844078, \]

and the variance is

\[ \mathrm{Var}(\hat{T}_x) = 23.749559. \]

The same measures were calculated numerically by inverting the analytical
result of Equation (3.30). The measures appear in Table 4.4.

Figure 4.2 Cumulative distribution function of T_x for a fuel consumption model.

The F-16 can carry external fuel tanks to extend its combat capabilities.
Consider a sensor that updates the pilot every τ = 10.0 minutes as to which tank the
engine is using. If the sensor detects that the internal fuel tanks are being used, the pilot
is warned and lands immediately to refuel. The long-run availability in this
example can be interpreted as the probability that at an arbitrary time t the aircraft is
using its external fuel tanks. The long-run availability is computed using Equation
(3.41). The mean analytical failure time was obtained via Equation (3.30), and the
long-run availability is

\[ A = 0.905818. \]

The same methods employed in the previous example were used to calculate the
simulated long-run availability for this and all remaining examples. The results
obtained via simulation are compared to the analytical solution in Table 4.4. This
example illustrates that the analytical results perform well when applied to a fuel
consumption model.


Table 4.3 Analytical versus empirical CDFs for a fuel consumption model.

  t    Analytical   Simulated   Deviation   |   t    Analytical   Simulated   Deviation
  28   0.0000053    0.0000000   0.0000053   |   50   0.6744067    0.6727400   0.0016667
  29   0.0000156    0.0000000   0.0000156   |   51   0.7436441    0.7431400   0.0005041
  30   0.0000466    0.0000400   0.0000066   |   52   0.8041333    0.8041800   0.0000467
  31   0.0001250    0.0001000   0.0000250   |   53   0.8549205    0.8542100   0.0007105
  32   0.0003077    0.0002800   0.0000277   |   54   0.8959107    0.8954700   0.0004407
  33   0.0007082    0.0007500   0.0000418   |   55   0.9277190    0.9267200   0.0009990
  34   0.0015347    0.0015200   0.0000147   |   56   0.9514543    0.9503200   0.0011343
  35   0.0031397    0.0033400   0.0002003   |   57   0.9684865    0.9676000   0.0008865
  36   0.0060782    0.0062900   0.0002118   |   58   0.9802400    0.9794900   0.0007500
  37   0.0111620    0.0111300   0.0000320   |   59   0.9880394    0.9879200   0.0001194
  38   0.0194913    0.0190800   0.0004113   |   60   0.9930157    0.9930700   0.0000543
  39   0.0324401    0.0323400   0.0001001   |   61   0.9960678    0.9962900   0.0002222
  40   0.0515734    0.0518300   0.0002566   |   62   0.9978670    0.9979400   0.0000730
  41   0.0784851    0.0787800   0.0002949   |   63   0.9988859    0.9989500   0.0000641
  42   0.1145669    0.1147900   0.0002231   |   64   0.9994401    0.9994400   0.0000001
  43   0.1607379    0.1628500   0.0021121   |   65   0.9997295    0.9997200   0.0000095
  44   0.2171889    0.2188000   0.0016111   |   66   0.9998745    0.9998900   0.0000155
  45   0.2832023    0.2823700   0.0008323   |   67   0.9999441    0.9999700   0.0000259
  46   0.3571021    0.3564800   0.0006221   |   68   0.9999761    0.9999900   0.0000139
  47   0.4363623    0.4359700   0.0003923   |   69   0.9999902    1.0000000   0.0000098
  48   0.5178656    0.5157800   0.0020856   |   70   0.9999962    1.0000000   0.0000038
  49   0.5982695    0.5957300   0.0025395   |   71   1.0000000    1.0000000   0.0000000

Table 4.4 Comparison of reliability measures for a fuel consumption model.

Measure        Analytical   Simulation   Deviation
MTTF           47.835850    47.844078     0.008228
Variance       23.634982    23.749559     0.114577
Availability    0.905818     0.905740     0.000078

4.4 Tire Tread Wear Model


The useful lifetime of a tire on the M-35 Diesel Engine Driven (DED) vehicle
can be characterized by the amount of tread remaining. Furthermore, the total
damage done to the tire may be modelled using the compound damage process developed
in chapter 3. Define the random variable Z(t) to be the driving conditions of the
vehicle at time t. Here the temporal domain is not defined in terms of conventional
time. Instead, it is defined in terms of thousands of miles driven, such that 1 unit
of "time" is 1,000 miles. The state space of the environment may be coded as

\[ S = \{\text{transporting cargo}, \text{convoy}, \text{garrison}, \text{off-road}, \text{towing}\} = \{1, 2, 3, 4, 5\}. \]

The tread wears at a continuous linear rate which depends solely on the operating
conditions of the vehicle. When the M-35 DED operates in state i ∈ S, the tread
will wear at rate r(i) mm/1,000 mi. The wear parameters for this problem are given in
the following vector:

\[ r = \begin{bmatrix} 1.0 & 2.0 & 3.0 & 4.0 & 10.0 \end{bmatrix}. \]

Furthermore, the time the vehicle operates in state i is an exponential random
variable with rate μ_i. Therefore, the infinitesimal generator matrix is

\[ Q = \begin{bmatrix}
   -0.500 & 0.125 & 0.125 & 0.125 & 0.125 \\
   0.400 & -2.000 & 0.400 & 0.600 & 0.600 \\
   0.025 & 0.025 & -0.100 & 0.025 & 0.025 \\
   0.050 & 0.050 & 0.050 & -0.200 & 0.050 \\
   1.500 & 1.000 & 1.000 & 1.500 & -5.000
   \end{bmatrix}. \]

Additionally, the tread experiences random shocks, possibly caused by harsh
braking or harsh road conditions. This damage process is independent of the wear
process. The shock process can be characterized as a Poisson shock process with
intensity λ = 0.25 shocks per 1,000 miles. The magnitude of each shock is distributed
according to a gamma probability density function with parameters β = 5 and k = 8.
The total damage to the tire is given by the random variable X(t) = W(t) + Y(t),
where W(t) is the amount of wear at time t and Y(t) is the total damage incurred
by shocks. Finally, assume the tire is considered unsafe to drive after the tread has
worn a total of x = 100 mm. The initial probability vector for the M-35 DED is

\[ \alpha = \begin{bmatrix} 0.20 & 0.20 & 0.20 & 0.20 & 0.20 \end{bmatrix}. \]

The remaining necessary matrices are

\[ R_D = \begin{bmatrix}
   1.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
   0.0 & 2.0 & 0.0 & 0.0 & 0.0 \\
   0.0 & 0.0 & 3.0 & 0.0 & 0.0 \\
   0.0 & 0.0 & 0.0 & 4.0 & 0.0 \\
   0.0 & 0.0 & 0.0 & 0.0 & 10.0
   \end{bmatrix} \]

and

\[ F_D(u) = \tilde{f}(u)\, I_5, \]

where I_5 is the 5 × 5 identity matrix and \tilde{f}(u) is the Laplace-Stieltjes transform of the
gamma shock-magnitude distribution with parameters β = 5 and k = 8.

Using these matrices and Equation (3.14), the Laplace-Stieltjes transform of the
failure time distribution was obtained. The approximation of the analytical
distribution is shown in Table 4.5, as is the empirical distribution generated by the
simulation. The two distributions are also depicted in Figure 4.3. The analytical
results closely match those obtained by simulation.


Table 4.5 Analytical versus empirical CDFs for a tread wear model.

  t    Analytical   Simulation   Deviation   |   t    Analytical   Simulation   Deviation
   0    0.000000     0.000000    0.000000    |   19    0.966844     0.967500    0.000656
   1    0.006711     0.006860    0.000149    |   20    0.976395     0.977150    0.000755
   2    0.030664     0.030280    0.000384    |   21    0.983424     0.983870    0.000446
   3    0.074461     0.073900    0.000561    |   22    0.988502     0.988430    0.000072
   4    0.136686     0.135750    0.000936    |   23    0.992140     0.991780    0.000360
   5    0.213330     0.211120    0.002210    |   24    0.994662     0.994310    0.000352
   6    0.299173     0.296940    0.002233    |   25    0.996386     0.996150    0.000236
   7    0.388842     0.387790    0.001052    |   26    0.997571     0.997310    0.000261
   8    0.477645     0.475600    0.002045    |   27    0.998370     0.998200    0.000170
   9    0.561752     0.559650    0.002102    |   28    0.998902     0.998670    0.000232
  10    0.638645     0.636260    0.002385    |   29    0.999268     0.999130    0.000138
  11    0.706847     0.704620    0.002227    |   30    0.999522     0.999470    0.000052
  12    0.765582     0.763230    0.002352    |   31    0.999693     0.999630    0.000063
  13    0.815315     0.815150    0.000165    |   32    0.999810     0.999760    0.000050
  14    0.856392     0.856310    0.000082    |   33    0.999891     0.999840    0.000051
  15    0.889820     0.890290    0.000470    |   34    0.999940     0.999910    0.000030
  16    0.916613     0.917310    0.000697    |   35    0.999968     0.999960    0.000008
  17    0.937790     0.938330    0.000540    |   36    0.999983     0.999980    0.000003
  18    0.954269     0.954750    0.000481    |   37    0.999991     1.000000    0.000009

Figure 4.3 Cumulative distribution function of T_x for a tread wear model.

Next, the numerical approximations of the mean and variance are compared
with those of the simulated data. The mean simulated time to failure is

\[ E[\hat{T}_x] = 8.955105, \tag{4.6} \]

and the variance of the failure times is

\[ \mathrm{Var}(\hat{T}_x) = 22.103945. \tag{4.7} \]

The same measures are calculated by numerically inverting the analytical result of
Equation (3.30). They appear in Table 4.6.

Finally, the long-run availability is considered under the manufacturer's
maintenance policy. The tire manufacturer recommends inspecting the tires every 5,000
miles and replacing tires that have exceeded 100 mm of total wear. The proportion
of time driven on safe tires can be considered the long-run availability of the tire
under such a maintenance policy. The long-run availability is computed for this
inspect-and-replace policy using Equation (3.41):

\[ A = 0.782018. \]

The simulated data were used to calculate the average availability over the duration
of the simulation,

\[ A = 0.782242. \]

It is compared to the analytical solution in Table 4.6.
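The analytical availability can be sanity-checked by hand from the tabulated CDF. The worked computation below is a sketch under two assumptions not stated in this section: that the expected detection time is E[T] = τ Σ_k (1 − G_x(kτ)) with τ = 5 (thousand miles), and that the terms beyond t = 35 are negligible (Table 4.5 gives G_x(37) = 0.999991). The CDF values are the analytical entries of Table 4.5 at t = 0, 5, ..., 35, and the MTTF is the analytical value 8.938213.

```latex
\begin{align*}
E[T] &\approx 5 \sum_{k=0}^{7} \bigl(1 - G_x(5k)\bigr) \\
     &= 5\,(1 + 0.786670 + 0.361355 + 0.110180 + 0.023605
            + 0.003614 + 0.000478 + 0.000032) \\
     &= 5 \times 2.285934 = 11.429670, \\[4pt]
A &= \frac{E[T_x]}{E[T]} \approx \frac{8.938213}{11.429670} \approx 0.7820 .
\end{align*}
```

This agrees with the reported analytical value to the four decimals that the tabulated data can support.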

Table 4.6 Comparison of reliability measures for tire tread.

Measure        Analytical   Simulation   Deviation
MTTF            8.938213     8.955105     0.016892
Variance       22.137441    22.103945     0.033496
Availability    0.782018     0.782242     0.000224

4.5 Lifetime of an Electric Circuit

This section considers an electric circuit whose operating environment consists
of 10 states, S = {1, 2, ..., 10}. The circuit is installed in a unit whose environment
at time t, Z(t), can be characterized by a CTMC whose infinitesimal generator
matrix is

\[ Q = \begin{bmatrix}
   -1.900 & 0.190 & 0.190 & 0.190 & 0.190 & 0.190 & 0.190 & 0.190 & 0.190 & 0.380 \\
   0.220 & -1.100 & 0.220 & 0.000 & 0.220 & 0.000 & 0.000 & 0.000 & 0.220 & 0.220 \\
   0.027 & 0.000 & -0.900 & 0.603 & 0.000 & 0.270 & 0.000 & 0.000 & 0.000 & 0.000 \\
   0.085 & 0.170 & 0.680 & -1.700 & 0.340 & 0.000 & 0.425 & 0.000 & 0.000 & 0.000 \\
   0.080 & 0.000 & 0.000 & 0.000 & -1.600 & 0.320 & 0.080 & 0.240 & 0.320 & 0.560 \\
   0.400 & 0.000 & 0.000 & 0.000 & 0.000 & -2.000 & 0.000 & 0.000 & 0.000 & 1.600 \\
   0.120 & 0.120 & 0.120 & 0.120 & 0.120 & 0.120 & -1.200 & 0.120 & 0.000 & 0.360 \\
   0.100 & 0.000 & 0.000 & 0.150 & 0.000 & 0.000 & 0.000 & -0.500 & 0.100 & 0.150 \\
   1.530 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & -1.700 & 0.170 \\
   0.040 & 0.040 & 0.040 & 0.040 & 0.040 & 0.040 & 0.080 & 0.080 & 0.400 & -0.800
   \end{bmatrix}. \]
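A generator of this size is easy to mistype, so a mechanical check is worthwhile before using it in any computation. The sketch below stores the 10 × 10 matrix row by row and verifies the defining properties of an infinitesimal generator: nonnegative off-diagonal entries, nonpositive diagonal entries, and rows that sum to zero. The helper name `is_generator` and the tolerance are choices made for this illustration.

```python
Q = [
    [-1.900, 0.190, 0.190, 0.190, 0.190, 0.190, 0.190, 0.190, 0.190, 0.380],
    [0.220, -1.100, 0.220, 0.000, 0.220, 0.000, 0.000, 0.000, 0.220, 0.220],
    [0.027, 0.000, -0.900, 0.603, 0.000, 0.270, 0.000, 0.000, 0.000, 0.000],
    [0.085, 0.170, 0.680, -1.700, 0.340, 0.000, 0.425, 0.000, 0.000, 0.000],
    [0.080, 0.000, 0.000, 0.000, -1.600, 0.320, 0.080, 0.240, 0.320, 0.560],
    [0.400, 0.000, 0.000, 0.000, 0.000, -2.000, 0.000, 0.000, 0.000, 1.600],
    [0.120, 0.120, 0.120, 0.120, 0.120, 0.120, -1.200, 0.120, 0.000, 0.360],
    [0.100, 0.000, 0.000, 0.150, 0.000, 0.000, 0.000, -0.500, 0.100, 0.150],
    [1.530, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, -1.700, 0.170],
    [0.040, 0.040, 0.040, 0.040, 0.040, 0.040, 0.080, 0.080, 0.400, -0.800],
]

def is_generator(matrix, tol=1e-9):
    """Return True if matrix is a valid CTMC infinitesimal generator."""
    n = len(matrix)
    for i, row in enumerate(matrix):
        if len(row) != n:
            return False                      # must be square
        if row[i] > tol:
            return False                      # diagonal entries are <= 0
        if any(q < -tol for j, q in enumerate(row) if j != i):
            return False                      # off-diagonal entries are >= 0
        if abs(sum(row)) > tol:
            return False                      # each row sums to zero
    return True

print(is_generator(Q))
```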

Furthermore, the circuit is known through experimental data to wear at rate r(i) per day
when the ambient environment is in state i ∈ S. The appropriate values are
contained on the diagonal of R_D. Additionally, the circuit incurs damage due to
the influence of random shocks. These shocks occur according to a Poisson process
with intensity λ = 0.25 shocks/day, with magnitudes that are distributed uniformly
over the interval [0, 5], so that the proper F_D matrix is the 10 × 10 diagonal matrix

\[ F_D(u) = \operatorname{diag}\bigl(g(u), g(u), \ldots, g(u)\bigr), \]

where

\[ g(u) = \frac{1 - e^{-5u}}{5u}. \]

Finally, the initial probability vector is

\[ \alpha = \begin{bmatrix} 0.80 & 0.05 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.15 \end{bmatrix}. \]

The circuit is assumed to fail when the total damage incurred from wear and
shocks exceeds 20 units of damage. Using the appropriate matrices and Equation
(3.14), the analytical distribution was approximated using the algorithm of [1]. The
analytical CDF is compared to the empirical CDF in Table 4.7 and Figure 4.4.

Table 4.7 Analytical versus empirical CDFs for an electrical circuit.

  t    Analytical   Simulation   Deviation   |   t    Analytical   Simulation   Deviation
   0    0.000000     0.000000    0.000000    |   23    0.848646     0.848710    0.000064
   1    0.000000     0.000000    0.000000    |   24    0.886360     0.885550    0.000810
   2    0.000002     0.000000    0.000002    |   25    0.916610     0.915870    0.000740
   3    0.000072     0.000060    0.000012    |   26    0.940221     0.939950    0.000271
   4    0.000329     0.000290    0.000039    |   27    0.958158     0.958060    0.000098
   5    0.001210     0.001160    0.000050    |   28    0.971418     0.971290    0.000128
   6    0.003549     0.003450    0.000099    |   29    0.980954     0.981150    0.000196
   7    0.008523     0.008630    0.000107    |   30    0.987627     0.987620    0.000007
   8    0.017683     0.018140    0.000457    |   31    0.992169     0.992090    0.000079
   9    0.032937     0.033740    0.000803    |   32    0.995176     0.995410    0.000234
  10    0.056102     0.057120    0.001018    |   33    0.997111     0.997330    0.000219
  11    0.088610     0.089870    0.001260    |   34    0.998321     0.998290    0.000031
  12    0.131316     0.133160    0.001844    |   35    0.999054     0.999030    0.000024
  13    0.184232     0.185820    0.001588    |   36    0.999484     0.999450    0.000034
  14    0.246409     0.247350    0.000941    |   37    0.999728     0.999740    0.000012
  15    0.316076     0.317220    0.001144    |   38    0.999862     0.999840    0.000022
  16    0.390865     0.391260    0.000395    |   39    0.999932     0.999900    0.000032
  17    0.468053     0.468300    0.000247    |   40    0.999968     0.999950    0.000018
  18    0.544842     0.544110    0.000732    |   41    0.999985     0.999980    0.000005
  19    0.618626     0.618940    0.000314    |   42    0.999993     0.999990    0.000003
  20    0.687212     0.688090    0.000878    |   43    0.999997     0.999990    0.000007
  21    0.748969     0.748620    0.000349    |   44    0.999999     0.999990    0.000009
  22    0.802903     0.802630    0.000273    |   45    1.000000     1.000000    0.000000

The mean and variance were computed analytically using Equation (3.30) and
are compared to the simulated data in Table 4.8. Next, the long-run availability of
the circuit is considered. The electric circuit is assumed to be inspected every day.
If the circuit has failed, then it is replaced with a new component. Otherwise no
action is taken. The availability of the system is calculated using Equation (3.41)
with the parameters τ = 1.0 and γ = 134.0.

Figure 4.4 Cumulative distribution function of T_x for an electrical circuit.

Table 4.8 Comparison of reliability measures for an electrical circuit.

Measure        Analytical   Simulation   Deviation
MTTF           17.680713    17.712524     0.031811
Variance       25.940322    25.510923     0.429399
Availability    0.972498     0.972494     0.000004

4.6 Failure Dynamics of a Birth-and-Death Process

In this section the operating costs of a manufacturing plant are considered.
The manufacturing operation is as follows. Customers place orders with the
company according to a stochastic arrival process. The plant has a limited storage and
production capacity. Only one order may be processed at a time, and up to an
additional 29 orders may be stored in the plant's holding facility. The company receives
orders from customers at an effective stationary rate of 0.8 orders/day, and is able
to fill those orders at a rate of 1.2 orders/day. The time between successive order
arrivals is an exponential random variable, as is the time to process an order. Define
the random variable Z(t) to be the number of orders in the system at time t. The
process {Z(t) : t ≥ 0} can be characterized as a birth-and-death process, which is a
specific type of CTMC. The infinitesimal generator matrix is a tri-diagonal, 31 × 31
matrix whose entries are

\[ q_{ij} = \begin{cases}
   -0.80 & i = 0,\ j = 0 \\
   0.80 & i = j - 1,\ j = 1, \ldots, 30 \\
   1.20 & i = j + 1,\ j = 0, \ldots, 29 \\
   -2.00 & i = j,\ j = 1, \ldots, 29 \\
   -1.20 & i = 30,\ j = 30 \\
   0.00 & \text{otherwise.}
   \end{cases} \]
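The piecewise definition above translates directly into code. The following sketch builds the 31 × 31 generator from the arrival and service rates given in the text; the function name `birth_death_generator` and its argument names are hypothetical, and the diagonal is filled in as the negative sum of the off-diagonal entries in each row, which reproduces the three diagonal cases listed above.

```python
def birth_death_generator(n_states=31, birth=0.8, death=1.2):
    """Build the tri-diagonal generator of a finite birth-and-death process."""
    Q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            Q[i][i + 1] = birth   # an order arrives: i -> i + 1
        if i - 1 >= 0:
            Q[i][i - 1] = death   # an order is completed: i -> i - 1
        Q[i][i] = -sum(Q[i])      # diagonal makes the row sum to zero
    return Q

Q = birth_death_generator()
print(Q[0][0], Q[15][15], Q[30][30])
```

Building the matrix programmatically avoids transcription errors and makes it easy to vary the buffer size or the rates.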

The operating costs of the plant depend on the number of orders in the system
and are given in thousands of dollars. It costs r(i) per day to operate the plant if
there are i ∈ S = {0, 1, ..., 30} orders in the system. If there are 10 orders or fewer in
the system, then r(i) = $0.1, i = 0, 1, ..., 10. If there are more than 10 orders in the
system, then the cost is $0.2 per day. The R_D matrix is the 31 × 31 diagonal matrix
with entries

\[ r_{ij} = \begin{cases}
   0.10 & i = j,\ j = 0, \ldots, 10 \\
   0.20 & i = j,\ j = 11, \ldots, 30 \\
   0.00 & \text{otherwise.}
   \end{cases} \]

The plant also incurs other random costs associated with operating a business.
These costs can be characterized by a Poisson shock process, where shocks arrive at
a rate of 1.0 shock/day and each shock has a random magnitude that is
an exponential random variable with a mean of $5.00. The F_D matrix is a 31 × 31
diagonal matrix such that

\[ [F_D(u)]_{ij} = \begin{cases}
   \dfrac{0.2}{0.2 + u} & i = j,\ j = 0, \ldots, 30 \\
   0 & \text{otherwise.}
   \end{cases} \]

Finally, the system is assumed, with probability 1, to always begin empty. Hence,

\[ \alpha_i = \begin{cases}
   1 & i = 0 \\
   0 & \text{otherwise.}
   \end{cases} \]

The time until the cost to operate the plant exceeds the threshold value x = 1.0
is a first passage time, and its distribution is calculated using the results derived in
chapter 3. Using Equation (3.14) and the preceding matrices, a numerical approximation
of the analytical CDF is calculated and displayed in Table 4.9. For comparison,
the empirical distribution found by simulation is also given in Table 4.9. The two
distributions are compared graphically in Figure 4.5.


Figure  4.5  Cumulative  distribution  function  of  for  a  birth-and-death  process. 

Table 4.9  Analytical versus empirical CDFs for a birth-and-death process.

    t   Analytical  Simulation  Deviation  |      t   Analytical  Simulation  Deviation
 0.00    0.000000    0.000000   0.000000   |   5.20    0.991673    0.991810   0.000137
 0.20    0.151847    0.151240   0.000607   |   5.40    0.993117    0.993160   0.000043
 0.40    0.281962    0.279870   0.002092   |   5.60    0.994218    0.994320   0.000102
 0.60    0.393193    0.393580   0.000387   |   5.80    0.995274    0.995340   0.000066
 0.80    0.488068    0.487630   0.000438   |   6.00    0.996285    0.996220   0.000065
 1.00    0.568822    0.568610   0.000212   |   6.20    0.997048    0.996870   0.000178
 1.20    0.637418    0.637680   0.000262   |   6.40    0.997522    0.997320   0.000202
 1.40    0.695573    0.696840   0.001267   |   6.60    0.997897    0.997740   0.000157
 1.60    0.744786    0.745450   0.000664   |   6.80    0.998318    0.998190   0.000128
 1.80    0.786357    0.787770   0.001413   |   7.00    0.998719    0.998460   0.000259
 2.00    0.821416    0.822240   0.000824   |   7.20    0.998983    0.998750   0.000233
 2.20    0.850937    0.852230   0.001293   |   7.40    0.999128    0.999050   0.000078
 2.40    0.875750    0.876250   0.000500   |   7.60    0.999262    0.999230   0.000032
 2.60    0.896553    0.897280   0.000727   |   7.80    0.999429    0.999310   0.000119
 2.80    0.913976    0.914760   0.000784   |   8.00    0.999576    0.999390   0.000186
 3.00    0.928594    0.929300   0.000706   |   8.20    0.999658    0.999470   0.000188
 3.20    0.940851    0.941610   0.000759   |   8.40    0.999702    0.999550   0.000152
 3.40    0.951031    0.951620   0.000589   |   8.60    0.999752    0.999630   0.000122
 3.60    0.959413    0.960310   0.000897   |   8.80    0.999816    0.999710   0.000106
 3.80    0.966406    0.966870   0.000464   |   9.00    0.999866    0.999770   0.000096
 4.00    0.972365    0.972850   0.000485   |   9.20    0.999889    0.999810   0.000079
 4.20    0.977366    0.977730   0.000364   |   9.40    0.999900    0.999850   0.000050
 4.40    0.981352    0.981660   0.000308   |   9.60    0.999919    0.999860   0.000059
 4.60    0.984506    0.985040   0.000534   |   9.80    0.999950    0.999890   0.000060
 4.80    0.987219    0.987780   0.000561   |  10.00    0.999979    1.000000   0.000021
 5.00    0.989665    0.990030   0.000365   |  10.20    1.000000    1.000000   0.000000

Numerical approximations of the mean and variance of the first passage time are calculated using the analytical result in Equation (3.30). The analytical values are compared with those found via simulation in Table 4.10.
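
The simulation model used to produce the empirical values is not reproduced in this section. The following is a minimal Monte Carlo sketch of the same example, written under stated assumptions: the environment is the birth-and-death chain with death rate 1.20 and an assumed birth rate of 0.80 (inferred from the generator rows summing to zero), cost accrues linearly at rate r(i) between events, and shocks arrive at rate 1.0 with exponentially distributed magnitudes of mean 5. The function names, the number of sample paths, and the seed are all hypothetical choices.

```python
import random

ARRIVAL, SERVICE = 0.80, 1.20       # birth rate is ASSUMED; death rate is from the text
SHOCK_RATE, SHOCK_MEAN = 1.0, 5.0   # Poisson shocks with exponential magnitudes
THRESHOLD, MAX_ORDERS = 1.0, 30


def wear_rate(orders):
    """Operating cost rate r(i) in thousands of dollars per day."""
    return 0.10 if orders <= 10 else 0.20


def first_passage_time(rng):
    """Simulate one path; return the time the cumulative cost reaches THRESHOLD."""
    t, cost, orders = 0.0, 0.0, 0
    while True:
        up = ARRIVAL if orders < MAX_ORDERS else 0.0
        down = SERVICE if orders > 0 else 0.0
        total = up + down + SHOCK_RATE
        dt = rng.expovariate(total)          # time to the next event of any kind
        r = wear_rate(orders)
        if cost + r * dt >= THRESHOLD:       # wear alone crosses before the event
            return t + (THRESHOLD - cost) / r
        t += dt
        cost += r * dt
        u = rng.random() * total             # pick the event type by its rate
        if u < up:
            orders += 1
        elif u < up + down:
            orders -= 1
        else:
            cost += rng.expovariate(1.0 / SHOCK_MEAN)
            if cost >= THRESHOLD:
                return t


def summarize(n_paths=20000, seed=1):
    """Return (sample mean, sample variance, empirical CDF at t = 1.0)."""
    rng = random.Random(seed)
    times = [first_passage_time(rng) for _ in range(n_paths)]
    mean = sum(times) / n_paths
    var = sum((x - mean) ** 2 for x in times) / (n_paths - 1)
    cdf_at_1 = sum(x <= 1.0 for x in times) / n_paths
    return mean, var, cdf_at_1
```

Calling `summarize()` returns estimates that can be set beside the MTTF, the variance, and the t = 1.00 entry of Table 4.9. Agreement is only expected up to sampling error, and only to the extent that the assumptions above match the model of chapter 3.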

Finally, if the operating costs exceed the threshold, the plant will effectively shut down. The company’s accountant reviews the expenditures at the end of each day, and if the operating costs have exceeded the threshold, she immediately pays the outstanding debts and the process renews. Therefore, the necessary parameters are τ = 1.0 and γ = 10.0. Using the simulated data, the average time at which the accountant discovers an overdraft is E[T] = 1.72313 days, and the average time at which the overdraft occurs is E[T_x] = 1.155588 days. Equation (4.5) implies that the long-run simulated availability is

\[
A = \frac{1.155588}{1.72313} = 0.670633.
\]

Using the analytical results of Equation (3.41), the long-run availability is calculated as

\[
A = 0.670967.
\]
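
The arithmetic behind the two availability figures can be checked with a short Python sketch. The first computation is the ratio quoted above. The second is an independent plausibility check, not the thesis's Equation (3.41), whose form is not shown in this section. It assumes that replacement is instantaneous at the first daily inspection after the threshold is crossed, so that a renewal cycle lasts τ⌈T_x/τ⌉ and its mean equals τ times the sum over n ≥ 0 of P(T_x > nτ). The function names are illustrative, and the parameter γ is not used.

```python
TAU = 1.0  # inspection interval in days (end-of-day review)


def availability(mean_uptime, mean_cycle):
    """Long-run fraction of time the system is up over a renewal cycle."""
    return mean_uptime / mean_cycle


def mean_cycle_from_cdf(cdf_at_inspections, tau=TAU):
    """E[tau * ceil(T / tau)] = tau * sum over n >= 0 of P(T > n * tau)."""
    return tau * sum(1.0 - f for f in cdf_at_inspections)


# Simulated values quoted in the text.
simulated = availability(1.155588, 1.72313)

# Analytical CDF at t = 0, 1, ..., 10 days, read from Table 4.9;
# the tail beyond t = 10 is treated as zero (ASSUMPTION).
analytical_cdf = [0.000000, 0.568822, 0.821416, 0.928594, 0.972365, 0.989665,
                  0.996285, 0.998719, 0.999576, 0.999866, 0.999979]
check = availability(1.157211, mean_cycle_from_cdf(analytical_cdf))
```

Because the check uses only the tabulated CDF and the analytical MTTF, it should land close to the reported analytical availability if the renewal argument above reflects the inspection policy.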


The  simulated  long-run  availability  and  numerical  approximation  of  the  analytical 
solution  are  compared  in  Table  4.10. 

Table 4.10  Comparison of reliability measures for a birth-and-death process.

Measure        Analytical   Simulation   Deviation
MTTF           1.157211     1.155588     0.001623
Variance       1.226178     1.222159     0.004019
Availability   0.670967     0.670633     0.000334

This  example  demonstrates  that  the  analysis  of  the  reliability  model  is  appropriate 
for  systems  whose  environmental  process  includes  a  large  number  of  states. 

4.7  Summary


In this chapter, the techniques and results of chapter 3 were applied to compute the relevant reliability measures of various types of systems. Although these examples are somewhat contrived, they demonstrate the wide variety of problems that may be solved using the technique. More importantly, they demonstrate that if the real-world system can be modelled appropriately, the analysis of this thesis provides accurate results without requiring lengthy and time-consuming simulation runs.

More specifically, the numerical approximations obtained with the algorithm of Abate and Whitt [1] closely match the results obtained via simulation. The maximum deviation in probability for these five examples was on the order of 2.5 x 10^{-3}. Moreover, the computation time needed to obtain the analytical results was orders of magnitude shorter than that needed to obtain the solutions via simulation. The average time required for numerical inversion was only a few seconds, while the average time required to complete the simulation experiment was 4 hours. The computational expedience of the analytical solution makes it possible to quickly compare the reliability of competing maintenance policies.
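
To make the inversion step concrete, the following Python sketch implements the Euler summation method, one widely used one-dimensional algorithm due to Abate and Whitt. Which variant and which tuning parameters were used for the results in this chapter is not stated here, so the choice of method and the values A = 18.4, n = 15, and m = 11 are assumptions made for illustration. The function name `euler_inversion` is hypothetical.

```python
import math


def euler_inversion(f_hat, t, A=18.4, n=15, m=11):
    """Approximate f(t) from its Laplace transform f_hat(s) by Euler summation.

    f_hat must accept a complex argument. A controls the discretization
    error; n and m control the truncation of the alternating series.
    """
    if t <= 0:
        raise ValueError("t must be positive")

    def partial_sum(terms):
        # Trapezoidal-rule approximation of the Bromwich integral,
        # truncated after the given number of terms.
        total = 0.5 * f_hat(complex(A / (2.0 * t), 0.0)).real
        for k in range(1, terms + 1):
            s = complex(A, 2.0 * k * math.pi) / (2.0 * t)
            total += (-1) ** k * f_hat(s).real
        return math.exp(A / 2.0) / t * total

    # Binomial (Euler) averaging of the partial sums n, n+1, ..., n+m
    # accelerates convergence of the alternating series.
    return sum(math.comb(m, k) * 2.0 ** (-m) * partial_sum(n + k)
               for k in range(m + 1))
```

A CDF can be recovered from the Laplace-Stieltjes transform of a distribution by dividing the transform by s before inverting. For example, `euler_inversion(lambda s: 0.2 / (s * (s + 0.2)), 5.0)` approximates the CDF at t = 5 of an exponential random variable with mean 5.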

Although the analytical solutions of chapter 3 are appealing, they do have minor limitations. The system to be analyzed must be modelled in the manner described in chapter 3. To do this, various parameters (transition rates, degradation rates, etc.) must be known or statistically estimated. Often, it is impossible to know the probability distribution of these parameters. If no information is available, then a hypothesized distribution must be used. Using observational data to estimate the parameters is a valid method but may require a large number of observations to obtain acceptable estimates. However, the same information is required to create a simulation, or any other model used to analyze the system described in this thesis.

The most restrictive assumption of this thesis is that the dynamics of the ambient environment form a CTMC. This assumption requires that state sojourn times be exponentially distributed. For some environments this Markovian assumption is not valid, and the numerical results may not be representative of reality. However, techniques exist to recover the Markovian property in a non-Markovian environment. To do this, the distributions of the sojourn times must be approximated by phase-type distributions (cf. Neuts [28]). Because phase-type distributions are constructed using phases of exponential distributions, the Markovian property is retained. Other possible extensions of this research are discussed in the next and final chapter.
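
As a small illustration of the phase-type idea, the Python sketch below represents an Erlang-k sojourn time as a phase-type distribution (alpha, T) and computes its mean and variance from the standard moment formula E[X^n] = n! alpha (-T)^{-n} 1. The Erlang family is only one simple member of the phase-type class and is chosen here as an assumption for illustration. The general fitting procedures discussed by Neuts are not reproduced, and the function names are hypothetical. The solver relies on T being upper triangular, which holds for Erlang (and other acyclic) representations but not for phase-type distributions in general.

```python
def erlang_phase_type(k, mean):
    """Return (alpha, T) for an Erlang-k distribution with the given mean."""
    rate = k / mean
    alpha = [1.0] + [0.0] * (k - 1)          # always start in the first phase
    T = [[0.0] * k for _ in range(k)]
    for i in range(k):
        T[i][i] = -rate                      # leave phase i at this rate
        if i + 1 < k:
            T[i][i + 1] = rate               # move on to phase i + 1
    return alpha, T


def solve_upper_triangular(U, b):
    """Solve U x = b by back substitution (U must be upper triangular)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / U[i][i]
    return x


def ph_mean_and_variance(alpha, T):
    """Mean and variance of a phase-type distribution with upper-triangular T."""
    n = len(alpha)
    neg_T = [[-T[i][j] for j in range(n)] for i in range(n)]
    x = solve_upper_triangular(neg_T, [1.0] * n)   # (-T)^{-1} 1
    y = solve_upper_triangular(neg_T, x)           # (-T)^{-2} 1
    m1 = sum(a * v for a, v in zip(alpha, x))
    m2 = 2.0 * sum(a * v for a, v in zip(alpha, y))
    return m1, m2 - m1 ** 2
```

Increasing k while holding the mean fixed lowers the variance, so a sojourn time that is less variable than an exponential can be mimicked by a chain of exponential phases, at the cost of enlarging the state space of the environment process.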




5.  Conclusions  and  Future  Research 


The field of reliability is concerned with understanding and improving system performance; however, most reliability models fail to capture the effect of a system’s operating environment on its useful lifetime. Those models rely on observation of failure times collected in static laboratory environments. These shortcomings led researchers in the field to develop a new class of state-dependent reliability models which characterize the dependence of the cumulative damage incurred by the unit on the system’s operating environment. Incorporating the ambient environment produces an analytical model that more aptly characterizes the system’s failure mechanism. However, state-dependent reliability models require more advanced analysis techniques, and often lead to intractable solutions. The existing analyses of compound damage models require computationally expensive and unstable multidimensional inversion algorithms to provide usable solutions. This thesis provides an analysis of a particular state-dependent reliability model. The main results provide closed-form analytical solutions for a class of compound damage models that may be readily implemented without such cumbersome methods.

An appropriate mathematical model was first constructed and then analyzed. The failure time distribution of a system subject to state-dependent wear and a Poisson shock process was derived as a two-dimensional Laplace-Stieltjes transform. This was accomplished by proving that the joint distribution of the cumulative damage process and the environmental state satisfies a system of partial differential equations, and then solving the system using transform methods. Next, the two-dimensional result was reduced to a one-dimensional Laplace-Stieltjes transform by converting the original system of partial differential equations into a system of ordinary differential equations that are solved using standard techniques. The moments of the distribution were then derived using elementary matrix calculus. Finally, the long-run availability of the system under an inspect-and-replace policy was derived using the theory of regenerative processes. All of the results derived in this thesis offer a significant advantage over existing solutions: they may be implemented using simple, efficient one-dimensional numerical inversion algorithms.

The final portion of this research effort illustrated the analytical solutions through numerical examples. First, the analytical results of chapter 3 were used to compute various system reliability measures. For each system, the necessary problem parameters were defined and the analytical solutions were approximated numerically via the one-dimensional inversion algorithm. Approximate solutions were also obtained through computer simulation. The analytical and simulated solutions were then compared, demonstrating the accuracy of the reliability measures obtained by the analytical result. Moreover, those reliability measures may be obtained more efficiently: the computation time is orders of magnitude smaller for the analytical results (a few seconds versus several hours).

To implement the techniques of this thesis, one must know the various problem parameters (transition rates, shock rates, shock magnitudes, wear rates, etc.). If there is no a priori information about the distribution of these parameters, then a hypothesized distribution must be used. Alternatively, one may observe the system, noting transition rates, wear rates, shock times, and shock magnitudes. Using this observational data (possibly obtained from sensors), one may estimate the appropriate parameters. Although using observational data is a valid technique, it may require a lengthy observation period. However, other methods used to measure the reliability of the system would be subject to the same limitations. The most restrictive assumption of this thesis is that the dynamics of the ambient environment must be modelled as a continuous-time Markov chain; the analysis cannot be applied directly to a non-Markovian system. Nevertheless, techniques exist to convert semi-Markov processes into CTMCs using phase-type distributions. It may be possible to use these techniques to apply the analysis of this model to a system with a non-Markovian environmental process.

As with all academic endeavors, this thesis builds on previous work and will serve as a foundation for future research. The main results of this thesis may be extended in several ways. First, a parameter analysis would be a fruitful area of research. If some of the parameters can be controlled in the real system, then a parameter analysis might identify the critical factors of a system’s reliability. For example, consider two system designs: one halves the rate at which the system experiences wear, and the other halves the mean damage done by shocks. A rigorous analysis may determine which system is more reliable. Second, the analysis of this thesis only considered a system subject to a single shock process. It is possible to analyze a system subject to multiple shock processes using similar techniques. Another interesting model would be one in which both the wear process and the shock process depend on the state of the ambient environment.

In conclusion, predicting a system’s reliability through the analysis of a stochastic reliability model is not a new problem. The classical approach has been to observe previous failures without regard to the system’s ambient environment. However, recent developments in the field of reliability theory have emphasized the need to incorporate the effects of the system’s operating environment as well. By utilizing the resulting bivariate process, this thesis contributes an analysis of the failure dynamics of a system subject to a state-dependent wear process and a Poisson shock process. Unlike previous works, this analysis produces analytical results that may be easily implemented to solve real-world problems.